1 Background

Silverbrook's bilithic Memjet(TM) printheads are the target printheads for printing systems which will be controlled by SoPEC and MoPEC devices.

This document presents the format and structure of these printheads, and describes their possible arrangements in the target systems. It also defines a set of terms used to differentiate between the types of printheads and the systems which use them.

1.1 Companion Documents

Currently, this document is only concerned with the structure of the printheads and their systems, with regard to the way in which dot data is loaded.

Refer to the Bilithic Printhead Specification [1] for the complete description of the functionality of these devices.

This document relies on certain definitions and details presented in Bilithic Printhead Specification [1].

1.2 Readership

It is intended that this document be used as a reference for engineers involved in the design work on the SoPEC and MoPEC projects.
2 Definitions

This document presents terminology and definitions used to describe the bilithic printhead systems. These terms and definitions are as follows:

• Printhead Type - There are 3 parameters which define the type of printhead used in a system:

    • Direction of the data flow through the printhead (clockwise or anti-clockwise, with the printhead shooting ink down onto the page).

    • Location of the left-most dot (upper row or lower row, with respect to V+).

    • Printhead footprint (type A or type B, characterized by the data pin being on the left or the right of V+, where V+ is at the top of the printhead).

• Printhead Arrangement - Even though there are 8 printhead types, each arrangement has to use a specific pairing of printheads, as discussed in Section 3. This gives 4 pairs of printheads. However, because the paper can flow in either direction with respect to the printheads, there are a total of eight possible arrangements, e.g. Arrangement 1 has a Type 0 printhead on the left with respect to the paper flow and a Type 1 printhead on the right. Arrangement 2 uses the same printhead pair as Arrangement 1, but the paper flows in the opposite direction.

• Color 0 is always the first color plane encountered by the paper.

• Dot 0 is defined as the nozzle which can print a dot in the left-most side of the page.

• The Even Plane of a color corresponds to the row of nozzles that prints dot 0.

Note that throughout this document, where the various printheads and systems are presented, the printheads always shoot ink down onto the page.

Figure 1 shows the 8 different possible printhead types. Type 0 is identical to the Right Printhead presented in Figure 3 in [1], and Type 1 is the same as the Left Printhead as defined in [1].



Confidential
October 21, 2002
Silverbrook Research
SoPEC/MoPEC Bilithic Printhead Reference
4-4-1-6-v1.0draft




[Figure: eight panels, one per printhead type (Type 0 printhead to Type 7 printhead), each showing the nozzle rows of Color n relative to V+.]

Figure 1. Printhead Types 0 to 7

Table 1 defines the position of each printhead type, with respect to the flow of paper, for the 8 possible arrangements.



Table 1. Definition of the different printhead arrangements

Printhead Arrangement    Printhead on left     Printhead on right
                         (w.r.t. paper flow)   (w.r.t. paper flow)
Arrangement 1            Type 0                Type 1
Arrangement 2            Type 1                Type 0
Arrangement 3            Type 2                Type 3
Arrangement 4            Type 3                Type 2
Arrangement 5            Type 4                Type 5
Arrangement 6            Type 5                Type 4
Arrangement 7            Type 6                Type 7
Arrangement 8            Type 7                Type 6
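The pairing rule behind Table 1 can be sketched in a few lines of Python. This is only an illustration: the function name `arrangement_pair` is hypothetical, the left/right reading of the columns is taken from the Arrangement 1 example in Section 2, and the two table cells that are damaged in the source (Arrangement 4 and Arrangement 8) are assumed to follow the same pattern as the other rows.

```python
def arrangement_pair(arrangement):
    """Return (left_type, right_type) for Arrangement 1..8, following Table 1.

    Hypothetical helper: left and right are taken with respect to paper flow.
    """
    if not 1 <= arrangement <= 8:
        raise ValueError("arrangement must be in 1..8")
    pair = (arrangement - 1) // 2         # 0..3: which of the 4 printhead pairs
    low, high = 2 * pair, 2 * pair + 1    # pair members, e.g. Type 0 and Type 1
    # Odd-numbered arrangements put the lower-numbered type on the left;
    # even-numbered arrangements use the same pair with the paper flow reversed.
    return (low, high) if arrangement % 2 == 1 else (high, low)


TABLE_1 = {a: arrangement_pair(a) for a in range(1, 9)}
```

Each of the 4 pairs appears twice in `TABLE_1`, once per paper-flow direction, which gives the 8 arrangements.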
3 Bilithic Printhead Systems

When using the bilithic printheads, the position of the power/gnd bars coupled with the physical footprint of the printheads mean that we must use a specific pairing of printheads together for printing on the same side of an A4 (or wider) page, e.g. we must always use a Type 0 printhead with a Type 1 printhead etc.

While a given printing system can use any one of the eight possible arrangements of printheads, this document only presents two of them, Arrangement 1 and Arrangement 2, for purposes of illustration. These two arrangements are discussed in subsequent sections of this document. However, the other 6 possibilities also need to be considered.

The main difference between the two printhead arrangements discussed in this document is the direction of the paper flow. Because of this, the dot data has to be loaded differently in Arrangement 1 compared to Arrangement 2, in order to render the page correctly.

3.1 Example 1: Printhead Arrangement 1



Figure 2 shows an Arrangement 1 printing setup, where the bilithic printheads are 
arranged as follows: 

• The Type 0 printhead is on the left with respect to the direction of the paper flow. 

• The Type 1 printhead is on the right. 



[Figure: Type 0 Printhead and Type 1 Printhead side by side, showing the nozzle rows of each color (Color 2 to Color 5 are legible) with even dots 0, 2, 4, ... and odd dots 1, 3, 5, ... up to n-2 and n-1, the V+ and Gnd bars, and the Direction of Paper Flow. The printheads are facing downwards. The ink is being shot down onto the page.]

Figure 2. Identification of printhead nozzles and shift-register sequences for printheads in Arrangement 1
Table 2 lists the order in which the dot data needs to be loaded into the above printhead system, to ensure color 0-dot 0 appears on the left side of the printed page.

Table 2. Order in which the even and odd dots are loaded for printhead Arrangement 1

Dot plane    Type 0 printhead                      Type 1 printhead
Odd          Loaded second in descending order.    Loaded first in descending order.
Even         Loaded first in ascending order.      Loaded second in ascending order.



Figure 3 shows how the dot data is demultiplexed within the printheads.

[Figure: Type 0 Printhead and Type 1 Printhead, each with Data[0] and Data[1] inputs.]

Figure 3. Demultiplexing of data within the printheads in Arrangement 1

Figure 4 and Figure 5 show the way in which the dot data needs to be loaded into the printheads in Arrangement 1, to ensure that color 0-dot 0 appears on the left side of the printed page.

[Figure: signal traces including Data[1] and SrClk.]

Figure 4. Signalling for a Type 0 printhead in Arrangement 1

Figure 5. Signalling for a Type 1 printhead in Arrangement 1
3.2 Example 2: Printhead Arrangement 2

Figure 6 shows an Arrangement 2 printing setup, where the bilithic printheads are arranged as follows:

• The Type 1 printhead is on the left with respect to the direction of the paper flow.

• The Type 0 printhead is on the right.

[Figure: Type 0 Printhead and Type 1 Printhead with the V+ and Gnd bars and the Direction of Paper Flow. The printheads are facing downwards. The ink is being shot down onto the page.]

Figure 6. Identification of printhead nozzles and shift-register sequences for printheads in Arrangement 2
Table 3 lists the order in which the dot data needs to be loaded into the above printhead system, to ensure color 0-dot 0 appears on the left side of the printed page.

Table 3. Order in which the even and odd dots are loaded for printhead Arrangement 2

Dot plane    Type 0 printhead                      Type 1 printhead
Odd          Loaded first in descending order.     Loaded second in descending order.
Even         Loaded second in ascending order.     Loaded first in ascending order.



Figure 7 shows how the dot data is demultiplexed within the printheads.

[Figure: Type 0 Printhead and Type 1 Printhead, each with Data[0] and Data[1] inputs.]

Figure 7. Demultiplexing of data within the printheads in Arrangement 2

Figure 8 and Figure 9 show the way in which the dot data needs to be loaded into the printheads in Arrangement 2, to ensure that color 0-dot 0 appears on the left side of the printed page.

[Figure: signal traces including Data[0] and Data[1].]

Figure 8. Signalling for a Type 0 printhead in Arrangement 2

Figure 9. Signalling for a Type 1 printhead in Arrangement 2
3.3 Conclusions

Comparing the signalling diagrams for Arrangement 1 with those shown for Arrangement 2, it can be seen that the color/dot sequence output for a printhead type in Arrangement 1 [...] which the color plane data is output, as well as whether even or odd data is output first.

[...] plane [...] loaded first (i.e. [...]) [...] arrangement. Also, the order in which the dots have to be loaded (e.g. even ascending or descending etc.) is dependent on the arrangement.

If the device controlling the printheads can reorder the bits according to the following criteria, then it should be able to operate in all the possible printhead arrangements:

• Be able to output the even or odd plane first.

• Be able to output the even and odd planes in either ascending or descending order, independently.

• [...] printhead.
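As a minimal sketch of the reordering capability described above, the following Python function produces the order in which the dots of one color row would be loaded. The function name and parameters are hypothetical, and only two controls are modelled: which plane goes first, and the direction of each plane. Any further criterion in the list is not modelled here.

```python
def dot_load_order(num_dots, first_plane, even_descending=False, odd_descending=False):
    """Return dot indices in load order for one color row (illustrative only)."""
    evens = list(range(0, num_dots, 2))   # even plane: dots 0, 2, 4, ...
    odds = list(range(1, num_dots, 2))    # odd plane: dots 1, 3, 5, ...
    if even_descending:
        evens.reverse()
    if odd_descending:
        odds.reverse()
    if first_plane == "even":
        return evens + odds
    if first_plane == "odd":
        return odds + evens
    raise ValueError("first_plane must be 'even' or 'odd'")


# "Even ascending loaded first, odd descending loaded second", for 6 dots:
example = dot_load_order(6, "even", odd_descending=True)
```

Because the plane order and the two directions are independent arguments, every even/odd and ascending/descending combination can be expressed with the same routine.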



[Figure: eight panels, Arrangement 1 to Arrangement 8, each showing the two printheads and the direction of the Paper.]

Figure 10. All 8 Printhead Arrangements
Table 4. Order in which even and odd dots and planes are loaded into the various printhead arrangements

Printhead Arrangement
Arrangement 1            Even ascending loaded first      Odd descending loaded first
                         Odd descending loaded second     Even ascending loaded second
Arrangement 2            Odd descending loaded first      Even ascending loaded first
                         Even ascending loaded second     Odd descending loaded second
Arrangement 3            Odd ascending loaded first       Even descending loaded first
                         Even descending loaded second    Odd ascending loaded second
Arrangement 4            Even descending loaded first     Odd ascending loaded first
                         Odd ascending loaded second      Even descending loaded second
Arrangement 5            Odd ascending loaded first       Even descending loaded first
                         Even descending loaded second    Odd ascending loaded second
Arrangement 6            Even descending loaded first     Odd ascending loaded first
                         Odd ascending loaded second      Even descending loaded second
Arrangement 7            Even ascending loaded first      Odd descending loaded first
                         Odd descending loaded second     Even ascending loaded second
Arrangement 8            Odd descending loaded first      Even ascending loaded first
                         Even ascending loaded second     Odd descending loaded second
Bi-lithic Printhead Specification

1.0 Basic Requirements

To create a two part printhead, of A4/Letter portrait width, to print a page in 2 seconds, by "stitching" reticle images.

The Memjet nozzles have a horizontal pitch [...] market as 1600 dpi.

The first nozzle of the right chip should have [...] nozzle of the left chip for the same color row. There is no ink nozzle overlap (of the same colour) scheme employed.

1.1 Power Supply

Vdd/Vpos and Ground supply is made through 30 um wide pads [...] using conductive adhesive to bus bars beside the chips [...].

(12V was considered for Vpos but routing of CMOS Vdd at 3.3 V would [...] over the length of the chips, but this will be revisited).

1.2 MEMS cells

The current Memjet device requires 180 nJ of energy [...] 1 usec. Assuming 95% efficiency, this requires a 55 ohm actuator [...] during this pulse.

1.2.1 ISSUE!!!

[...] time. That is about 8 Amperes if all nozzles fire.

That 8 Amperes is for only 1 colour! 16A * 6 colours = 96 A for all colours.

How many colours could print at the same time? [...] colours at the time are required to create [...] ground). But the fixative ink is also required, and 12% coverage of InfraRed ink, [...]


1.2.2 64 um unit cell height

This cell would have 4 line spacing between the odd and even dots, and 8 line spacing between adjacent colours.

1.2.3 80 um unit cell height

This cell would have 5 line spacing between the odd and even dots, and 10 line spacing between adjacent colours.
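The line spacings quoted for the two unit cell heights can be checked with a short calculation. The sketch below assumes a nominal 1600 dpi dot pitch (25400 um / 1600 = 15.875 um) and assumes that adjacent colours are two unit cells apart; both are working assumptions for illustration, and the helper name is hypothetical.

```python
MICRONS_PER_INCH = 25400.0


def line_spacings(cell_height_um, dpi=1600):
    """Return (odd/even spacing, colour spacing) in print lines (illustrative)."""
    pitch_um = MICRONS_PER_INCH / dpi              # 15.875 um at a nominal 1600 dpi
    odd_even = round(cell_height_um / pitch_um)    # rows are one unit cell apart
    colour = 2 * odd_even                          # assumed: two rows per colour
    return odd_even, colour


spacing_64 = line_spacings(64)
spacing_80 = line_spacings(80)
```

With these assumptions the 64 um cell gives 4 and 8 lines and the 80 um cell gives 5 and 10 lines, matching the figures stated above.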

1.3 Versions

1.3.1 6 Colour 1600 dpi with 64 um unit cell

Left and Right Chip. This version will not be prototyped.

1.3.2 6 Colour 1600 dpi with 80 um unit cell

Left and Right Chip.

1.3.3 4 Colour 800 dpi with 80 um unit cell

For camera application. Single nozzle row per colour.
This version will not be prototyped.

1.4 Air Supply

Air must be supplied to the MEMS region through holes in the chip.

2.0 Head Sizes

Enough to provide [...]

TABLE 1. Head Combinations

Left Head    Right Head
[...]

[...] "Stitch Parts" [...] 18+104)*12. Nozzles per row [...] wafer layout, manages to avoid this set, without any losses.



3.0 Interface

TABLE 2. I/O pins

Columns: Name, I/O, Function, Max Speed, Common.

Data[0-1], DataL[0-1]
    Dot data for colours 0 - 5, using Differential Signalling (DataL the complementary signal). Colours [0-2] on Data[0], colours [3-5] on Data[1].
    Feedback for CMOS testing (LSyncL=1, ReadL=0) and (LSyncL=0, ReadL=0):
    [0] - nozzle test result
    [1] - temperature
    Max Speed: 300. Common: No (a).

SrClk
    Dot data shift clock using Differential Signalling (SrClkL the complementary signal).
    Max Speed: 600 (b). Common: No (a).

ReadL, FrClk, LSyncL, [...]
    Pulse Profile for all colours
    0 - Capture dot data for next print line
    Max Speed: 0.1 [...]. Common: Yes.

a. [...] common, but for timing/electrical reasons should run point to point.
b. 300 MHz clock, so edges are 600 MHz rate.
c. 1 MHz cycle, but the resolution of the mark/space ratio may require 50 ns.
d. 10 kHz cycle, with minimum low pulse of 10 ns (no maximum), [...] controller (SoPEC).
3.1 Dot firing

To fire a nozzle, three signals are needed: a dot data, a fire signal, and a profile. When all signals are high, the nozzle will fire.

FIGURE 1. Print head structure

[...] and clocked into the chip with SrClk. The dot data is [...] the dot pattern by the dot latch is being fired.

Across the top of a column [...] again with one register bit in each direction flow.
FIGURE 2. Column Structure

[Figure: Column n, with Dot[0], Dot[1], Dot[2] inputs and their SrClk0, SrClk1, SrClk2 clocks.]

The select register forms the Select [...] selects the reverse direction fire register, [...] whole colour row at the same time (with a slight [...]
FIGURE 3. Print head dot shift register dot mapping to page

[Figure: Paper Movement; ink shooting out of page [...] heads; reader looking through paper over [...]; odd dots 5, 3, 1 up to n-1; section A-A through even nozzles; even dots 4, 2, 0.]

With this mapping, the following data streams will need to be provided.

TABLE 3. Head Combinations [...] dot patterns (n=13824)

Left Head dot order    Right Head dot order
[...]                  [...] 4075, 4077, 4079 [...]

(C0, C1, [...]) [...] printing mode. Note SrClk [...] pulses (and 3L+1 rising edges).


FIGURE 4. Data Timing During Printing

[Figure: signal traces including LSyncL.]

3.3 Fire Shift Register

[...] that (4800 [...]

FIGURE 5. Print quality

[...] at the same time starting [...]
To achieve this fire pattern the fire shift register and select shift register need to be set up as shown in Figure 6.

FIGURE 6. Fire and Select Shift Register setup for printing

[...]                                                      fire shift reg
...0000000 0001111111 1110000000 0001111111 111            select shift reg

The pattern has shifted a '1' into the fire shift register every n-th position (where n is usually a minimum of about 100) and n '1's, followed by n '0's in the select shift register. At the start of a print cycle, these patterns need to be aligned as above, with the "1000..." of a forward half of the fire shift register matching an n grouping of '1's or '0's in the select shift register. As well, the "1000..." of a reverse half of the fire shift register must match an n grouping of '1's or '0's in the select shift register. And to continue this print pattern across the butt ends of the chips, the select shift register in each should end with a complete block of n '1's (or '0's).

FIGURE 7. Fire Pattern across butt end of Print Chips

...1110000000 ....0001111111 ....111 1111111 ....1110000000 ....0001111111
Left Print Head Fire/Select SR       Right Print Head Fire/Select SR
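The two register patterns described above can be illustrated with a small Python sketch. The helper names are hypothetical, the registers are modelled as plain strings rather than hardware shift registers, and the forward and reverse halves of the fire shift register are not distinguished.

```python
def fire_pattern(length, n):
    """A '1' every n positions, i.e. repeated "1000..." groups of length n."""
    return "".join("1" if i % n == 0 else "0" for i in range(length))


def select_pattern(length, n, first="1"):
    """Runs of n identical bits, alternating between '1's and '0's."""
    other = "0" if first == "1" else "1"
    return "".join(first if (i // n) % 2 == 0 else other for i in range(length))


def aligned(fire, select, n):
    """True if every fire '1' sits at the start of a run in the select pattern."""
    return all(i % n == 0 and select[i:i + n] in ("1" * n, "0" * n)
               for i, bit in enumerate(fire) if bit == "1")


fire = fire_pattern(16, 4)      # "1000100010001000"
select = select_pattern(16, 4)  # "1111000011110000"
```

In this toy case the register length is a multiple of n, so the select pattern ends on a complete block, which is the condition the text asks for at the butt ends of the chips.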



Since the two chips can be of different lengths, it makes initialisation of these patterns difficult. This is solved by building initialisation circuitry into the chips. This circuit is controlled by two registers, nlen(14) and count(14), and b(1). These registers are loaded serially through Data[0], while LSyncL is low, and ReadL is high with FrClk.

FIGURE 8. Fire Pattern Generation

[Figure: nlen and count registers feeding FS_INIT (clocked by a gated FrClk), with a serial load path enabled by Scan; the fire shift register is clocked by FsClk, a gated FrClk; the select shift register is clocked by SelClk, a gated FrClk.]

The scan order from input is b, n[13-0], c[0-13], therefore b is shifted in last.
The following table shows the values to programme the bi-lithic head pairs using a fire [...]

TABLE 4. Head Combinations Initialisation for n=100

Nozzles  Nozzles  nlen(A&B)  countA =          bA  bB  rem =        countB =
LA       LB       = n-1      (LA/2) mod n - 1          (LB/2) mod n (LA-LB+rem) mod n - 1
9744     4080     99         71                0   0   40           3
8328     5496     99         63                0   0   48           79
6912     6912     99         55                0   0   56           55



[...] and once the registers are initialised with LA FrClk cycles (ReadL='0', LSyncL='1'). rem would be the correct value for countB if chip B was only clocked (FrClk) LB times. But this chip will be over clocked LA-LB cycles. The values of bA and bB are either the same or the inverse of each other. The actual value does not matter. They need to be different from each other if the select shift registers would end up with different values at the butt ends. If (LA/2/2) is even (and countA is non zero), then the final run in A's select shift register will be !bA. If (LA-LB/2) mod n is even (and countB is non zero) then the final run in B's select shift register will be !bB.
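The initialisation values in TABLE 4 can be reproduced with a few lines of Python. The column formulas are partly garbled in the source, so the expressions below are a reading of them that has been checked against the three table rows; the function name and dictionary keys are hypothetical, and the case where a modulo result is zero (which would give a count of -1) is not specified in the text and is not handled here.

```python
def init_values(la, lb, n=100):
    """Fire-pattern initialisation values for a head pair (illustrative sketch).

    la, lb: nozzle counts LA and LB of chips A and B; n: fire pattern length.
    """
    nlen = n - 1
    count_a = (la // 2) % n - 1
    rem = (lb // 2) % n                    # would suit chip B if clocked LB times
    count_b = (la - lb + rem) % n - 1      # chip B is over clocked LA-LB cycles
    return {"nlen": nlen, "count_a": count_a, "rem": rem, "count_b": count_b}


rows = [init_values(9744, 4080), init_values(8328, 5496), init_values(6912, 6912)]
```

For n=100 this yields countA values of 71, 63 and 55 and countB values of 3, 79 and 55, in agreement with the table.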



FIGURE 9. Determining Select Shift Register value

[Figure: Head A and Head B select shift registers, of length LA/2 [...], with the final runs determined by countA+1 and countB+1.]



3.4 Profile Pattern

A profile pattern is repeated at FrClk rate. It is expected to be a single pulse about 1 us long. But it could be a more complicated series of pulses. The actual pattern depends on the ink type.

The following figure shows the external timing to print a line of data. In this example the line is printed in 8 cycles of FrClk.

FIGURE 10. Timing for printing Signals

[Figure: signal traces for LSyncL, ReadL, Data, SrClk, FrClk and Pr.]



3.5 Interface Modes

The print heads have eight different modes controlled by signals ReadL and LSyncL. As seen in Figure 9, with both LSyncL and ReadL high, the chip is in normal printing mode. Some of these modes can operate at the same time, but may interfere with the result of the other modes.



TABLE 5. Print Head Modes

ReadL  LSyncL  Mode and Internal Mapping

1      1       Normal Print Mode
               Internal mapping: srclk=SrClk/3, frclk=FrClk, SelClk=0, FsClk=FrClk, Scan=0, CoreScan=0

X      0       Dot Load Mode
               • Dot latches are open, loaded with Dot shift registers, latch once LSyncL returns to 1 (this happens regardless of ReadL)
               • Enables Dot Shift register to capture fire result.

1      0       Fire Load Mode
               • Data[0] will shift through nlen, count and b with FrClk
               Internal mapping: srclk=X, frclk=X, SelClk=X, FsClk=FrClk, Scan=1, CoreScan=X

0      1       Reset Nozzle Test
               • Resets the state of nozzle test circuit
               Internal mapping: srclk=SrClk, frclk=FrClk, SelClk=FrClk, FsClk=FrClk, Scan=0, CoreScan=1

0      1       CMOS testing mode
               • The contents of the dot shift registers are serial shifted out on Data[0-1] with SrClk

0      1       Fire Initialise mode
               • The contents of the fire shift register and select shift register are generated with FrClk

0      0       Temperature Output
               • The series of Delta Sigma outputs are clocked out on Data[0] with FrClk. The sum of these bits represents the temperature of the chip.
               Internal mapping: frclk=0, SelClk=0, FsClk=0, Scan=0, CoreScan=X

0      0       Nozzle Test Output
               • The result of a nozzle test is output on Data[1].
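The mode selection in TABLE 5 can be summarised as a lookup from the two control pins to the set of modes that share that pin state. The sketch below is illustrative only: the dictionary and function names are hypothetical, and the "X" (don't care) entry for Dot Load Mode is expanded to both ReadL values.

```python
# Keys are (ReadL, LSyncL); values are the modes listed for that pin state.
MODES = {
    (1, 1): ["Normal Print Mode"],
    (1, 0): ["Dot Load Mode", "Fire Load Mode"],
    (0, 1): ["Reset Nozzle Test", "CMOS Testing Mode", "Fire Initialise Mode"],
    (0, 0): ["Dot Load Mode", "Temperature Output", "Nozzle Test Output"],
}


def modes_for(read_l, lsync_l):
    """Return the modes active for a given ReadL / LSyncL pin state."""
    return MODES[(read_l, lsync_l)]


ALL_MODES = {mode for modes in MODES.values() for mode in modes}
```

Two pins give only four pin states, so several of the eight modes necessarily share a state; this is why the text notes that some modes operate at the same time and may interfere with each other.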



3.5.1 Printing

Figure 10 shows timing for normal printing. During this action, we drop out of Normal Print Mode to Dot Load Mode between line transfers. For printing to perform correctly, no other signal should be stable.

3.5.2 Initialising for Printing

To initialise for printing, the fire shift registers and select shift registers need to be set up into a state as shown in Figure 7. To do this the chips are put into Fire Load Mode and the values for nlen, count and b are serially shifted from Data[0] clocked by FrClk. As the two chips have separate Data lines, and a common FrClk, this happens at the same time. Once this is done, the mode is changed to Fire Initialise Mode, and further FrClk cycles are provided to both chips. During all these operations Pr should be low, to prevent unintentional firing of nozzles.
FIGURE 11. Initialising Print Heads

[Figure: signal traces for LSyncL, ReadL, DataA[0] (bA, nlen[13-0], count[0-13]A), DataB[0] (bB, nlen[13-0], count[0-13]B), SrClk, FrClk and Pr, showing Fire Load Mode followed by LA cycles of Fire Initialise Mode.]



3.5.3 Nozzle Testing

Nozzle testing is done by firing a single nozzle at a time and monitoring the Data[1] pin in the Nozzle Test Output mode.

Each nozzle has a test switch which closes when its nozzle is fired. All 12 switches in a nozzle column are connected in parallel to the following circuit.



FIGURE 12. Nozzle Test Latching Circuit

[Figure: latch circuit with LSyncL & !ReadL input, Testin, Testout, Switch node and Vdd.]



This circuit is initialised whenever LSyncL is high and ReadL is low (Reset Nozzle Test mode). This forces all "switch nodes" low, and the feedback through the lower NOR gate will latch this value. With LSyncL low and ReadL still low (Nozzle Test Output mode) the Testout of the first nozzle column is output on Data[1]. If any switch is closed, the switch node of this column will be pulled up, and will ripple through to the output as a transition from high to low.
FIGURE 13. Nozzle Testing

[Figure: signal traces for LSyncL, ReadL, FrClk and Pr, showing Set up Test, Reset Nozzle Test Mode and Nozzle Test Output Mode phases.]


Nozzle testing requires a setup phase in order to fire only one nozzle. There are many ways to achieve this. Simplest might be to load a single colour with 101010... through the even nozzles, and 010101... for the odd nozzles (0's for all other colours), and set up a fire pattern with n = LA/2. With this fire pattern only one nozzle will fire in each Pr pulse. After firing in Nozzle Test Output mode, a single FrClk will advance to the next nozzle, then Reset and Test. After LA/2 cycles of this testing, a single SrClk will advance the dot shift registers to set up the untested nozzles of this colour, and another LA/2 cycles of FrClk, Reset and Test will finish testing this colour. Then repeat the test procedure for other colours.
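The test procedure above can be written out as a sequence of abstract steps. The Python generator below is a rough sketch only: the step names are hypothetical labels rather than real signals or commands, and the exact ordering of Reset, fire and FrClk within a cycle is one plausible reading of the text.

```python
def nozzle_test_steps(nozzles_per_colour, colours=6):
    """Yield (step, colour) tuples for the nozzle test procedure (illustrative)."""
    half = nozzles_per_colour // 2           # LA/2 cycles per pass
    for colour in range(colours):
        yield ("load_alternating_dots", colour)    # 1010... even, 0101... odd
        yield ("set_fire_pattern_n_half", colour)  # fire pattern with n = LA/2
        for pass_no in range(2):
            if pass_no == 1:
                # A single SrClk shifts the dot pattern to the untested nozzles.
                yield ("SrClk", colour)
            for _ in range(half):
                yield ("reset_nozzle_test", colour)
                yield ("fire_and_read_Data1", colour)
                yield ("FrClk", colour)            # advance to the next nozzle


steps = list(nozzle_test_steps(4, colours=1))
```

Each colour needs two passes of LA/2 firings, so every nozzle of the colour is fired exactly once.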

3.5.4 Temperature Output

This mode is not well defined yet. In this mode, Data[0] will output a series of ones and zeros clocked by FrClk. After a (currently unknown) number of FrClk cycles the sum of this series will represent the temperature of the chip. Clocking frequency in this mode is expected to be in the range 10 kHz - 1 MHz.
FIGURE 14. Temperature Reading

[Figure: signal traces for LSyncL, ReadL, Data[0], SrClk, FrClk and Pr.]

The frequency of FrClk and the number of cycles need to be programmable. Since this mode cycles FrClk, the contents of the fire shift register and select shift register would be changed, but in this mode FrClk is disabled to these circuits. So printing can resume without reinitialising.



3.5.5 CMOS Testing

CMOS testing is a mode meant for chip testing before MEMS is added to the chip. This mode allows the dot shift register to be shifted out on the Data[0-1] pins. Much like the nozzle test mode, the nozzles are fired while LSyncL is low, but during the firing SrClk will be cycled, and the dot shift register will load the signal that would fire the nozzle. Once captured, the result can be shifted out.



FIGURE 15. CMOS Testing

[Figure: signal traces for LSyncL, ReadL, Data, SrClk, FrClk and Pr, showing Set up Test, Dot Load Mode and CMOS Test Output Mode phases.]

The Dot Load Mode above violates normal printing procedure by firing the nozzles (Pr) and modifying the dot shift register (SrClk).
4.0 Reticle Layout

To make long chips we need to stitch the CMOS (and MEMS) together by overlapping the reticle stepping field. The reticle will contain two areas:

FIGURE 16. Reticle Layout

The top edge of Area 2, PAD END, contains the pads that stitch on the bottom edge of Area 1, CORE. Area 1 contains the core array of nozzle logic. The top edge of Area 1 will stitch to the bottom edge of itself. Finally the bottom edge of Area 2, BUTT END, will stitch to the top edge of Area 1. The butt end is used to complete a feedback wiring and seal the chip.

The above region will then be exposed across a wafer bottom to top: Area 2, Area 1, Area 1, ..., Area 2. Only the pad end of Area 2 needs to fit on the wafer. The final exposure of Area 2 only requires the butt end on the wafer.
FIGURE 17. Stepper Pattern on Wafer

4.1 TSMC U-Frame requirements.

TSMC will be building us frames 10 mm x 0.23 mm which will be placed either side of both Area 1 and Area 2.

TSMC requires 6 mm area for blading between the two exposure areas. This translates to [...] on the reticle; as some reticles are 2x size, while most are 5x, the worst case must be used.
1 Introduction

1.1 Document History

Version    Date                Author          Changes
1.6        29 November, 2002   Simon Walmsley  Updated [...] protocols document, got rid of 68k reference now that we are using LEON.
1.5        26 November, 2002   Simon Walmsley  Added description of storing more than a single SoPEC_id_key in a PRINTER_QA (in section 3.5.3 and related). This reduces the cost of a multi-SoPEC system with no loss of security.
                                               Also added text to describe that batch keys can be different for each SoPEC if the indirect upgrade key protocol is used.
1.4        9 September, 2002   Simon Walmsley  Added section in requirements detailing types of attacks we care about and don't care about.
1.3        30 August, 2002     Simon Walmsley  Changed ComCo.OEM_xxxx variables into simply xxxx variables, since that is more generic. Added text regarding ink refill. Added extra software authentication stage to prevent ComCos from fiddling with SoPEC software.
1.2        29 August, 2002     Simon Walmsley  Added section on how the PRINTER_QA chip gets programmed with the SoPEC_id_key.
1.1        28 August, 2002     Simon Walmsley  Updated to have ink and operating parameters be authenticated via symmetric key based signatures based on a unique SoPEC_id.
1.0        27 August, 2002     Simon Walmsley  Updated after review.
0.2 draft  26 August, 2002     Simon Walmsley  Changed public-key and private key references to asymmetric & symmetric respectively, so private can now sub-refer to the private key of the asymmetric pair, or the single private symmetric key. Changed OEM_id into ComCo_OEM_license_id to more accurately reflect the scope of the id.
0.1 draft  26 August, 2002     Simon Walmsley  Initial issue.

1.2
1.3 SCOPE

This document describes the basic security requirements of programs running on the SoPEC ASIC [1]. It then describes an implementation solution to the security requirements.

[...] of the SoPEC ASIC as well as implying key [...] associated authentication protocols [5].

[...] document.
1.4 READERSHIP

This document is written for software engineers and [...] with SoPEC as well as PCB designers that are responsible for SoPEC-based Print Engines. A similar audience working on PEC and PEC-based Print Engines may also find this document useful.

This document is also intended to be read by those responsible for key management and associated database designers with regards to guiding requirements.

This document is confidential to Silverbrook Research [...] outside this organisation must be covered by a non-disclosure agreement (NDA).

1.5 QA Chip Terminology

The Authentication Protocols document [5] refers to QA Chips by their function in particular protocols:

• For authenticated reads, ChipR is the QA Chip being read from, and ChipT is the QA Chip that identifies whether the data read from ChipR can be trusted.

• For replacement of keys, ChipP is the QA Chip being programmed with the new key, and ChipF is the factory QA Chip that generates the message to program the new key.

• For upgrades of data in memory vectors, ChipU is the QA Chip being upgraded, and ChipS is the QA Chip that signs the upgrade value.

Any given physical QA Chip will contain functionality that allows it to operate as an entity in some number of these protocols.

Therefore wherever the terms ChipR, ChipT, ChipP, ChipF, ChipU and ChipS are used in this document, they are referring to logical entities involved in an authentication protocol as defined in [5].

[...] physical [...]. In the same way, the QA Chip inside the printer is referred to as PRINTER_QA, and will be on a separate bus to the INK_QA chips.
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2 Requirements 

2.1 SECURITY 

The basic functional security requirements are: 

- Silverbrook code and OEM program code co-existing safely 

. Silveibrook operating parametere authentication 

• OEM operating parameters authentication 

• Ink usage authentication 

Each of these is outlined in subsequent sections. 

The authentication requirements imply that:

• OEMs and end-users must not be able to replace or tamper with Silverbrook program code or data
• OEMs and end-users must not be able to call unauthorized functions within Silverbrook program code
• End-users must not be able to replace or tamper with OEM program code or data
• End-users must not be able to call unauthorized functions within OEM program code
• OEMs must be able to test products at their highest upgradable status, yet not be able to ship them outside the terms of their license
• OEMs and end-users must be restricted to operating-system-permitted GPIO pins and timers

2.1.1 Silverbrook code and OEM program code co-existing safely

activated. 

• for SoPEC is a form of protection management, whereby Sil- 

be restricted to Silverbrook program code only. 
2.1.2 Silverbrook operating parameters authentication

program code

H..^. OEM b. of •-"'S-'^;^ .b. PH., E,«i~ « «» 

upgraded status before selling the Print Engine to the

• must not be able to tamper with or replace OEM program code
• must not be able to tamper with the PEP blocks or service-related peripherals.

2.1.4 Ink usage authentication

Each OEM sells printers and ink to end users according to a business model. For example, one OEM may sell a printer at price $A, while a second OEM (OEM2) may sell the same featured printer at a higher price $A+$X. The requirement is that end-users of a given OEM's printers can only use that OEM's ink; for example, end-users of OEM2 printers can only use OEM2 ink.

2.2 ACCEPTABLE COMPROMISES

Since is no p,o««io. Pby*»ay -^^JS'^'CS^e^ 

code etc. It is impossible to guard against such an attack, 
we are really orUy -cerned wi^con^^i^ -^^^^ 

of printer operating parameter '^^'^^^V^^ S^d by one that can be down- 



of the license agreement. 



I. a fraiJdng machine prints stamps 
upgrading the single print engine only. If an end-user modifies their own Print Engine, that is an acceptable compromise. However, it doesn't mean we have to make it totally simple or cheap for the end-user to accomplish this.

Software-only attacks are the most dangerous, since they can be transmitted via the internet and have no perceived cost. Physical modification attacks are far less problematic, since most printer users are not willing to physically modify their printers. This is even more true if the cost of the physical modification is likely to exceed the price of a legitimate upgrade.

rity. 

2.3 IMPLEMENTATION CONSTRAINTS

Any solution to the requirements detailed in Section 2.1 must also meet certain implementation constraints. These are:

• No flash memory inside SoPEC
• SoPEC must be simple to verify
• Silverbrook program code must be updateable
• OEM program code must be updateable
• Must be bootable from activity on USB or ISI
• No extra pins to assign IDs to slave SoPECs
• Cannot trust the comms channel to the QA Chip in the printer (PRINTER_QA)
• Cannot trust the comms channel to the QA Chip in the ink cartridges (INK_QA)
• Cannot trust the ISI comms channel

These constraints are detailed below.

2.3.1 No flash memory inside SoPEC

SoPEC is intended to be »'nP>«^'"*°il*'V;: ' ^ ^^idered. Although Vinige have a process 



few bits 

2.3.2 SoPEC must be simple to verify 

All combinatorial logic and embedded program code must be verified before manufacture. Every increase in complexity in either increases verification effort and increases risk.

• verified completely (see Section 2.3.1)
• correct for all possible future uses of SoPEC systems
• finished in time for SoPEC manufacture
Therefore the complete Silverbrook program code must not permanently reside on SoPEC. It must be possible to update the Silverbrook program code as enhancements to functionality are made and bug fixes are applied.

In the worst case, only new printers would receive the new functionality or bug fixes. In the best case, existing SoPEC users can download new embedded code to enable functionality or bug fixes. Ideally, these same users would be obtaining these updates from the OEM website or equivalent, and not require any interaction with Silverbrook.

2.3.4 OEM program code must be updateable

Given that each OEM will be writing specific program code for printers that have not yet been conceived, it is impossible for all OEM program code to be embedded in SoPEC at the ASIC manufacture stage.

Since flash memory is not available (see Section 2.3.1), OEMs cannot store their program code in on-chip flash. While it is theoretically possible to store OEM program code in ROM on SoPEC, this would entail OEM-specific ASICs which would be prohibitively expensive. Therefore OEM program code cannot permanently reside on SoPEC.

Since OEM program code must be downloadable for SoPEC to execute, it should therefore be possible to update the OEM program code as enhancements to functionality are made and bug fixes are applied.

In the worst case, only new printers would receive the new functionality or bug fixes. In the best case, existing SoPEC users can download new embedded code to enable functionality or bug fixes. Ideally, these same users would be obtaining these updates from the OEM website or equivalent, and not require any interaction with Silverbrook.

2.3.5 Must be bootable from activity on USB or ISI 

SoPEC can be placed in sleep mode to save power when printing is not required. RAM is not preserved in sleep mode. Therefore any program code and data in RAM will be lost. However, SoPEC must be capable of being woken up from the host when it is time to print again.

In the case of a single SoPEC system, the host communicates with SoPEC via USB.

In the case of a multi-SoPEC system, the host typically communicates with the ISI Master chip (e.g. the ISI Master could be SoPEC, and the comms is USB), and can send messages to other slave SoPECs via the ISI master. The ISI master SoPEC relays these messages to the slaves via the ISI.

Therefore SoPEC must be capable of being woken up by activity on either the USB or on 
the ISI. 

2.3.6 No extra pins to assign IDs to slave SoPECs

In a single SoPEC system the host only sends data to the single SoPEC. However in a multi-SoPEC system, each of the slaves needs to be uniquely identifiable so that the host can send data to the correct slave.

Since there is no flash on board SoPEC (Section 2.3.1) we are unable to store a slave ID (e.g. 4 bits) in each SoPEC. Moreover, any ROM in each SoPEC will be identical.

A design goal of SoPEC is to minimise the cost of features only used in multi-SoPEC systems. We already have pins for inter-SoPEC communications, and further pins would add to the cost.

2.3.7 Cannot trust the comms channel to the QA Chip in the printer (PRINTER_QA)

SoPEC cannot rely on the communication channel to the PRINTER_QA being secure. It is possible for an attacker to replace the PRINTER_QA chip or subvert the communications channel.

2.3.8 Cannot trust the comms channel to the QA Chip in the ink cartridges (INK_QA)

SoPEC cannot rely on the communication channel to the INK_QA being secure. It is possible for an attacker to replace the INK_QA chip or subvert the communications channel.

2.3.9 Cannot trust the ISI comms channel

• c^wr cv«:tpm that has a non-USB connection to 

man-in-the-middle attacks). 
3 Proposed Solution

A proposed solution to the requirements of Section 2 can be summarised as:

• Each SoPEC has a unique id
• CPU with user/supervisor mode
• Memory Management Unit
• SoPEC ISI identification

3.1 EACH SoPEC HAS A UNIQUE ID

Each SoPEC needs to contain a unique SoPEC_id. SoPEC_id is used to form a symmetric key unique to each SoPEC: SoPEC_id_key.

The verification of operating parameters -<^^ -fS^SSl^^^^^^^^ 

cult to determine. Difficult to determine ^^^S^^^^- ^^P^ 

mine the id via software, or by ^^^"^ *f ^""^^^Xc on specific test pins on the 

U is important to note that In the P^P^J ^^^J^Tagr^r^^^ 

3.2 CPU WITH USER/SUPERVISOR MODE

V8 instruction set).

Silverbrook (operating system) program code will run in supervisor mode, and all OEM program code will run in user mode.

3.3 Memory Management Unit

^ . TT rMMtn that limits access to regions of 



1. On IBM's Cu-11 process this chipId is 80 bits.
mitted.



DRAM. 



Access to all the non-valid address space should be trapped, regardless of user or supervisor mode, and regardless of the access being read, execute, or write.

Access to all of the valid non-DRAM address space (for example the PEP blocks) is supervisor read/write access only (no supervisor execute access, and user mode has no access at all), with the exception that certain GPIO and Timer registers can also be accessed by user code. Each peripheral block will determine how the access is restricted.
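
The access rules for the non-DRAM address space can be illustrated with a small model. This is only a sketch under stated assumptions: the address range, the set of user-visible registers and the function name are hypothetical and are not taken from the SoPEC memory map, and DRAM is left out of the model entirely.

```python
from enum import Enum


class Access(Enum):
    READ = "read"
    WRITE = "write"
    EXECUTE = "execute"


# Hypothetical address map, for illustration only.
VALID_NON_DRAM = range(0x1000, 0x2000)     # e.g. PEP blocks and peripherals
USER_VISIBLE_REGISTERS = {0x1100, 0x1104}  # e.g. selected GPIO/Timer registers


def non_dram_access_allowed(addr: int, supervisor: bool, access: Access) -> bool:
    """Return True if the access is permitted, False if it should trap."""
    if addr not in VALID_NON_DRAM:
        return False                       # non-valid address space always traps
    if access is Access.EXECUTE:
        return False                       # no execute access, even for supervisor
    if supervisor:
        return True                        # supervisor read/write is allowed
    return addr in USER_VISIBLE_REGISTERS  # user mode: whitelisted registers only
```

In this model a user-mode write to 0x1100 is allowed, while a user-mode read of any other peripheral address, or any execute access, would trap.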

The embedded DRAM should start at OxOOOO_0000 to support P^J^^^^/^ to 
dereferencing to be tr^ped. 

With respect to the DRAM and ;ubs>.^^^^^^^^^^^ 

The SoPEC_id parameter (see Section 3.1) should only be accessible in supervisor mode, and should only be stored and manipulated in a region of memory that has no user mode access.



3.4 SPECIFIC ENTRY POINTS IN O/S

The implementation for this depends on the CPU.

On the LEON processor, the TRAP instruction allows programs to switch between user and supervisor mode in a controlled way. The TRAP switches between user and supervisor register sets, and calls a specific entry point in the supervisor code space in supervisor mode. The TRAP handler dispatches the service request, and then returns to the caller in user mode.
updates occur. 

The LEON also allows supervisor mode code to call user mode code. There are a number of ways that this functionality can be implemented.
3.5 Boot Procedure

3.5.1 Basic premise

Th. i«.»«on i. .0 load Snv.*,»>. ^^S'^c't^, »p.- 

ating parameters. 

We perform authentication of program code and data using asymmetric cryptography and without using a QA Chip.

Assuming we have already downloaded some data and a 160-bit signature into eDRAM, the boot loader needs to perform the following tasks:

• perform SHA-1 on the downloaded data to calculate a digest localDigest
• perform asymmetric decryption on the downloaded signature (160 bits) using an asymmetric public key to obtain authorizedDigest
• if authorizedDigest is the same as localDigest, then the downloaded data is authorized and control can then be passed to the downloaded data

probed and the security is compromised. 

The procedure requires the following data item:

• boot0key = an n-bit asymmetric public key

The procedure also requires the following two functions:

• SHA-1 = a function that performs SHA-1 on a range of memory and returns a 160-bit digest
• decrypt = a function that performs asymmetric decryption of a message using the passed-in key

Assuming that all of these are available (e.g. in the boot ROM), boot loader 0 can be defined as in the following pseudocode:

bootloader0(data, sig)
    localDigest ← SHA-1(data)
    authorizedDigest ← decrypt(sig, boot0key)
    If (localDigest = authorizedDigest)
        jump to program code at data-start address // will never return
    Else
        // program code is unauthorized
    EndIf
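
The boot loader 0 check can be tried out with a minimal Python sketch. It assumes textbook RSA as the asymmetric scheme (the source does not name one) and uses a toy key built from two Mersenne primes, which is far too weak for real use. The names sign and boot_loader_0 are illustrative, and the function returns a boolean instead of jumping to the program code.

```python
import hashlib

# Toy RSA key pair (illustration only, NOT secure): two Mersenne primes.
P = 2**127 - 1
Q = 2**89 - 1
N = P * Q                          # public modulus, part of the assumed boot0key
E = 65537                          # public exponent, part of the assumed boot0key
D = pow(E, -1, (P - 1) * (Q - 1))  # private exponent, held only by the signer


def sha1_int(data: bytes) -> int:
    """SHA-1 digest of data as a 160-bit integer."""
    return int.from_bytes(hashlib.sha1(data).digest(), "big")


def sign(data: bytes) -> int:
    """Signer side (hypothetical): encrypt the digest with the private key."""
    return pow(sha1_int(data), D, N)


def boot_loader_0(data: bytes, sig: int) -> bool:
    """Return True if data is authorized (the real loader would jump to it)."""
    local_digest = sha1_int(data)
    authorized_digest = pow(sig, E, N)   # asymmetric decrypt with the public key
    return local_digest == authorized_digest
```

For example, boot_loader_0(code, sign(code)) succeeds, while the same signature presented with modified code is rejected.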



from some hacker in Norway). 
Therefore the entire security of SoPEC is based on keeping the asymmetric private key paired to boot0key secure.

If a compromise is discovered, it may be possible to replace the boot0key value in SoPEC's ROM, since this is only a single mask change, and would be simple to verify and characterize.

3.5.2 Hierarchies of authentication

Given that test programs, evaluation programs, and Silverbrook O/S code need to be written and tested, and OEM program code etc. also needs to be written and tested, it is not secure to have a single authentication key that signs everything (Silverbrook O/S, non-O/S code, and so on).

code contains the key for authenticating the next. 

This method allows for any hierarchy of authentication, based on a root key of boot0key.

For example, assume that we have the following entities: 

rS^Cci, SiW^b^c.. SoPBC^ '»^-» -^y- "■^'^'^ 
ASICS »KlSoPECO/Spm«MSotl«™o>>Co»Co. MemjB 

etc. customizing the Pnnt Engine tor a givcu „roduct to sell to the 

. OEM. a company that uses a 'Z^^.^^Z^^^cJ^^'^'^^^- 
ca&.vscTS. The OEM would supply the motor control logic, user m 

The levels of authentication hierarchy are as follows:

. SoPECCo generates '^-f • -tt'cl^Jo's^i^^^^^ SoPECCo 
the print engine functionality) and the C°y"Co s asymme P 

is „ oper«ing pm«nc»r Mock for a g.v=n OEM • P""' "J"" priv.tt 
^ prtn. =ngin. "™Ew«^t pJTspeea ».g.=s. 
The OEM can produce as many versions of dataset5 as it likes (e.g. for testing purposes or for updates to drivers etc).

The relationship is shown below in Figure 1.

Figure 1. Relationship between the datasets (dataset1 is supplied to the ComCo, dataset3 to the OEM, and dataset5 to the end-user)

SoPEC itself validates dataset1 via the boot0key mechanism. The validation hierarchy is shown in Figure 2.



SoPEC boot ROM (includes boot0key public key)
    validation via boot0key
dataset1: operating system (includes ComCo public key)
    validation via ComCo key
dataset2: operating parameters (includes OEM public key)
    validation via OEM key
dataset4: OEM program code

Figure 2. Validation hierarchy
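
The chain in Figure 2, where each validated dataset carries the public key used to validate the next, can be sketched as follows. The sketch assumes textbook RSA with toy keys built from Mersenne primes (not secure) and a hypothetical packing of payload plus next public key; none of these details come from the source.

```python
import hashlib

E = 65537  # shared public exponent for all toy keys


def make_key(p: int, q: int) -> dict:
    """Toy RSA key from two primes (illustration only, NOT secure)."""
    return {"n": p * q, "d": pow(E, -1, (p - 1) * (q - 1))}


def digest(data: bytes) -> int:
    return int.from_bytes(hashlib.sha1(data).digest(), "big")


def sign(data: bytes, key: dict) -> int:
    return pow(digest(data), key["d"], key["n"])


def verify(data: bytes, sig: int, n: int) -> bool:
    return pow(sig, E, n) == digest(data)


def pack(payload: bytes, next_n) -> bytes:
    """Hypothetical layout: payload followed by the next level's public key."""
    key_bytes = b"" if next_n is None else next_n.to_bytes(160, "big")
    return payload + b"|" + key_bytes


def validate_chain(root_n: int, chain) -> bool:
    """chain is a list of (payload, next_public_n, signature) tuples."""
    n = root_n
    for payload, next_n, sig in chain:
        if not verify(pack(payload, next_n), sig, n):
            return False
        n = next_n          # the validated dataset supplies the next key
    return True


boot0 = make_key(2**127 - 1, 2**89 - 1)
comco = make_key(2**107 - 1, 2**61 - 1)
oem = make_key(2**521 - 1, 2**607 - 1)

chain = [
    (b"operating system", comco["n"], sign(pack(b"operating system", comco["n"]), boot0)),
    (b"operating parms", oem["n"], sign(pack(b"operating parms", oem["n"]), comco)),
    (b"OEM program code", None, sign(pack(b"OEM program code", None), oem)),
]
```

Because each public key is covered by the signature of the level above it, swapping in a different OEM key without the ComCo private key breaks validation at that level.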
private key is compromised, then the OEM program code is compromised. A compromise of boot0key compromises everything up to SoPEC itself, and would require a mask ROM change in SoPEC to fix.

private key paired to bootOkey secure, 
3.5.3 Authenticating operating parameters

and OEM operating parameters. Both sets of °P^^'"8 J™ ^ pj^ter to 

ment of host O/S drivers etc. 

On PRINTER_QA, memory vector M0 contains the upgradable operating parameters, and memory vectors M1+ contain any constant (non-upgradable) operating parameters.

Considering only Silverbrook operating parameters for the moment, there are actually two problems:

a. setting and storing the Silverbrook operating parameters, which should be authorized only by Silverbrook
b. reading the parameters into SoPEC, which is an issue of SoPEC authenticating the data on the PRINTER_QA chip, since we don't trust PRINTER_QA.

The PRINTER_QA chip therefore contains the following symmetric keys:

• K0 = PrintEngineLicense_key
• K1 = SoPEC_id_key. This key is unique for each SoPEC (see Section 3.1), and has no write permissions to anything.

K0 is used to solve problem (a). It is only used to authenticate the actual upgrades of the operating parameters, following the standard upgrade protocol [5], with PRINTER_QA acting as the ChipU and another QA Chip acting as the ChipS.

K, . »^ by ..PEC » so^v.^^ 

SS« „,is>« to SoPECs toe, »i. wh.„ the fT, p.8e 
Note that the procedure for verifying reads of data from PRINTER_QA does not rely on Silverbrook's key K0. This means that precisely the same mechanism can be used to read and authenticate the OEM data also stored in PRINTER_QA. Of course this must be done by Silverbrook supervisor code so that SoPEC_id_key is not revealed.

If the OEM also requires upgradable parameters, we can add an extra key to PRINTER_QA, where that key is an OEM_key and has write permissions to the OEM part of M0.

In this way, K1 never needs to be known by anyone except the SoPEC and PRINTER_QA.

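Each printing SoPEC in a multi-SoPEC system needs access to a PRINTER_QA chip that contains the appropriate SoPEC_id_key to validate ink usage and operating parameters. This can be accomplished by a separate PRINTER_QA for each SoPEC, or by adding extra keys (multiple SoPEC_id_keys) to a single PRINTER_QA.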

ular rate If line syncs arrived faster than the partcular rate the PHI "^""r^ ^r",. 
?uffe?i<i«rl ms would mean that even if the motor speed was hacked to be fast, the 
print will tenninate. 

3.5.3.1 OEM assembly-line test

stored in the PRINTER_QA as described in Section 3.5.3.

ferent set of operating parameters i.e. a maxmially upgraded Pnnt Engine. 

would be perfonned. 

At first thought, it might be considered that a ^<-^^'^:^^^^^^^^^^,Z^^ 
PRINTER^QA containing upgraded parameters m.ght be « ^olu^^ How^«. 
SOPEC to accept the parameter, as tme fP^"^ -f^^'^J^^plJ TX^TL 

brook machine (e.g. over a net). Neither approaches are good. 
If there is no special test PRINTER_QA for testing, then we must make use of special test programs, or storage on the PRINTER_QA, or both. The solution will depend on the test requirements of the OEM.

would not want the OEM to have such a program. 

Likewise, if a te.t progran^ only P^-^JP^atl^f^^nL^l^'^^^ 

not only docs tWs change '^^^Jf ^^^S te ^^^^ test images. TOs may 

before printing) but a service must be ^ every time a test image is 

gets out into the public, the user can only print blank pages. 

If the OEM requires tests that actually print dots, there are several possibilities:

OEM test patterns cannot be printed.
b. A version of the O/S that prints garbage in special places over the test image. This has the disadvantage that special OEM test patterns cannot be printed.

c. A version of the O/S that reads and decrements a DecrementOnly value in PRINTER_QA. If the value before successful decrementing is non-zero, then the O/S prints at full upgrade capability.

PRIMTER QA customization may only need to be i or z. 
Of these solutions, op^on (c) is probably the leajt .^^^^ 

srdij:errxw-^^^^^^^^^ 

m lpfU capability, and power must stay on whxle doing so. 



3.5.4 Use of a PrintEngineLicense id

Silverbn,ok O/S program code contai^ ^c^^^^^^ 

the subsequent OEM P"S«^ -^/^^^^r^Ur^^^^^^ ^^-^^ 
SoPEC only contains a smgle root key. * « u printer driver for OEM, run 

applications to be run identically physical ^''g^" P"™'' 
on an identically physical Print Engine from OEMj. 

Prt«B.sir..»e«..W code 2 r^ fto M,,). As wi(h .U other opeadng 
same time as the other various PRINTER_QA customizations are being applied, before being shipped to the OEM site.

In this way, the OEMs can be sure of differentiating themselves through software functionality.

3.5.5 Authentication of Ink

of dots printed for each ink. 

be stored in Mi+ witlun INKJJA, 

xtaVyfisted bv means of PRINTER_QA, a 
Just as the Print Engine operating ^^^^^^^ZnyAth specifically licensed 
given Print Engine license n^y o-^y ^ ^^^^„SnTS set oTink types, colors. 
S^^SrSius:grure jl'^^^^^^^^ -ching against the data in the 

INK_QA. 

SoPEC must be able to authenticate reads from the INK_QA, both in terms of ink parameters as well as ink remaining.

To authenticate ink a number of steps must be taken:

• restrict access to dot counts
• authenticate ink usage and ink parameters via INK_QA and PRINTER_QA
• broadcast ink dot usage to all SoPECs in a multi-SoPEC system



3.5.5.1 Restrict access to dot counts

Since the dot counts are accessed via the PHI, access to the dot counts must be restricted to supervisor mode code. Otherwise it might be possible for OEM program code to clear dot counts before authentication has occurred.

3.5.5.2 Authenticate ink usage and ink parameters via INK_QA and PRINTER_QA

Xhebasicproblemofa^n^^^^^^^^^^ 

INK QA, the count has been correctly decremented. 
INK^QA. 

We cannot write the SoPEC_id_key to the INK_QA for two reasons:

• updating keys is not power-safe (i.e. if power is removed mid-update, the INK_QA could be rendered useless)
• a SoPEC into which the ink cartridge is subsequently placed would not know the old SoPEC_id_key (knowledge of the old key is required in order to change the old key to a new one).

The proposed solution is to let INK_QA have two keys:

• K0 = SupplyInkLicense_key. This key has write permissions to the ink remaining.
• K1 = UseInkLicense_key. This key is specific to a given ink usage agreement involving an OEM (in the same way as PrintEngineLicense_key, which is stored as K0 in PRINTER_QA). K1 has no write permissions to anything.

UseInkLicense_key is also stored in PRINTER_QA (e.g. in K2), also with no write permissions.

This means there are two shared keys, with PRINTER_QA sharing both, and thereby acting as a bridge between INK_QA and SoPEC.

• UseInkLicense_key is shared between INK_QA and PRINTER_QA
• SoPEC_id_key is shared between SoPEC and PRINTER_QA

All SoPEC has to do is perform an authenticated read from INK_QA, pass the data and signature to PRINTER_QA, let PRINTER_QA validate the data and signature, and get PRINTER_QA to produce a new signature based on the shared SoPEC_id_key. SoPEC can then verify that signature. If it matches, the data from INK_QA must be valid, and can therefore be trusted.

Once the data from INK_QA is known to be trusted, the amount of ink remaining can be checked, and the other ink licensing parameters such as InkUsageLicense_id can be checked for validity.

The actual steps of read authentication as performed by SoPEC are:

KEY1 ← 1    // simple constants to specify which key to use when signing
KEY2 ← 2

R_PRINTER ← PRINTER_QA.random()
// authenticated read of INK_QA with key1 (UseInkLicense_key), using R_PRINTER
// pass the data and signature to PRINTER_QA, which validates them and
// produces a new signature SIG_PRINTER using SoPEC_id_key and R_SOPEC
SIG_SOPEC ← HMAC_SHA1(R_PRINTER | R_SOPEC | M_INK) using SoPEC_id_key
If (SIG_PRINTER = SIG_SOPEC)
    // M_INK is valid, so check the amount of ink remaining
    If (M_INK.inkRemaining = expectedInkRemaining)
        // all is ok
    Else
        // the ink value is not what we wrote, so don't print anything anymore
    EndIf
Else
    // the data read from INK_QA is not valid and cannot be trusted
EndIf



Strictly speaking, we don't need a nonce (R_SOPEC) all the time because M0 (containing the ink remaining) should be decrementing between authentications. However we do need one to retrieve the initial amount of ink and the other ink parameters (at power up). This is why taking a random number from the WatchDogTimer at the receipt of the first page is acceptable.

In summary, the SoPEC performs the non-authenticated write [5] of ink remaining to the INK_QA chip, and then performs an authenticated read of the data via the PRINTER_QA as per the pseudocode above. If the value is authenticated, and the INK_QA ink-remaining value matches the expected value, the count was correctly decremented and the printing can continue.
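
The bridging role of PRINTER_QA can be sketched in Python with HMAC-SHA1 from the standard library. The class names, method names (including translate), message layout and nonce sizes below are assumptions made for illustration; the actual QA Chip command set is defined in [5] and is not reproduced here.

```python
import hashlib
import hmac
import os


def sign(key: bytes, *parts: bytes) -> bytes:
    """HMAC-SHA1 over the concatenation of fixed-length parts."""
    return hmac.new(key, b"".join(parts), hashlib.sha1).digest()


class InkQA:
    """Hypothetical model of INK_QA: holds UseInkLicense_key and ink remaining."""

    def __init__(self, use_ink_license_key: bytes, ink_remaining: int):
        self.key = use_ink_license_key
        self.ink_remaining = ink_remaining

    def write_ink(self, value: int) -> None:   # non-authenticated write
        self.ink_remaining = value

    def read(self, r_printer: bytes):
        r_ink = os.urandom(20)
        m_ink = self.ink_remaining.to_bytes(4, "big")
        return r_ink, m_ink, sign(self.key, r_printer, r_ink, m_ink)


class PrinterQA:
    """Hypothetical model of PRINTER_QA: shares one key with each side."""

    def __init__(self, use_ink_license_key: bytes, sopec_id_key: bytes):
        self.use_ink_license_key = use_ink_license_key
        self.sopec_id_key = sopec_id_key
        self.nonce = b""

    def random(self) -> bytes:
        self.nonce = os.urandom(20)
        return self.nonce

    def translate(self, r_ink, m_ink, sig_ink, r_sopec):
        """Validate the INK_QA signature, then re-sign with SoPEC_id_key."""
        expected = sign(self.use_ink_license_key, self.nonce, r_ink, m_ink)
        if not hmac.compare_digest(expected, sig_ink):
            return None
        r_printer = os.urandom(20)
        return r_printer, sign(self.sopec_id_key, r_printer, r_sopec, m_ink)


def ink_read_is_trusted(sopec_id_key, ink_qa, printer_qa, expected_ink) -> bool:
    """SoPEC side: authenticated read of INK_QA via PRINTER_QA."""
    r_ink, m_ink, sig_ink = ink_qa.read(printer_qa.random())
    r_sopec = os.urandom(20)
    result = printer_qa.translate(r_ink, m_ink, sig_ink, r_sopec)
    if result is None:
        return False                     # INK_QA data cannot be trusted
    r_printer, sig_printer = result
    sig_sopec = sign(sopec_id_key, r_printer, r_sopec, m_ink)
    if not hmac.compare_digest(sig_printer, sig_sopec):
        return False
    return int.from_bytes(m_ink, "big") == expected_ink
```

SoPEC never needs UseInkLicense_key and INK_QA never needs SoPEC_id_key; only PRINTER_QA holds both, which is the property the text relies on.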

3.5.5.3 Broadcast ink dot usage to all SoPECs in a multi-SoPEC system

In a multi-SoPEC system, each SoPEC attached to a printhead (4 at most) must broadcast 
its ink usage to all the SoPECs. In this way, each SoPEC will have its own version of the 
expected ink usage. 

In the case of a man-in-the-middle attack, at worst the count in a given SoPEC is only its 
own count (i.e. all broadcasts are turned into 0 ink usage by the man-in-the-middle). 

A single SoPEC performs the update of ink remaining to the INK_QA chip, and then all SoPECs perform an authenticated read of the data via the appropriate PRINTER_QA (the PRINTER_QA that contains their matching SoPEC_id_key - remember that multiple SoPEC_id_keys can be stored in a single PRINTER_QA). If the value is authenticated, and the INK_QA value matches the expected value, the count was correctly decremented and the printing can continue.

If any of the broadcasts are not received, or have been tampered with, the updated ink counts will not match. The only case this does not cater for is if each SoPEC is tricked (via an ISI man-in-the-middle attack) into a total that is the same, yet not the true total. Apart from the fact that this is not viable for general pages, at worst this is the maximum amount of ink printed by a single SoPEC. We don't care about protecting against this case.
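
The effect of dropped or zeroed broadcasts can be shown with a little arithmetic. The following sketch uses hypothetical function names and made-up usage figures; it only illustrates that each SoPEC derives its own expected value and compares it with the authenticated INK_QA value.

```python
def expected_ink_remaining(previous: int, own_usage: int, broadcasts) -> int:
    """Each SoPEC's own view of what INK_QA should hold after the page."""
    return previous - own_usage - sum(broadcasts)


def printing_may_continue(authenticated_value: int, previous: int,
                          own_usage: int, broadcasts) -> bool:
    """Compare the authenticated INK_QA value with the local expectation."""
    return authenticated_value == expected_ink_remaining(
        previous, own_usage, broadcasts)


# Four printing SoPECs with made-up dot usage for one page.
usage = [10, 20, 30, 40]
previous = 1000
written = previous - sum(usage)    # value the updating SoPEC writes to INK_QA

# SoPEC 2 receives all broadcasts intact.
ok = printing_may_continue(written, previous, usage[2], [10, 20, 40])

# A man-in-the-middle turns every broadcast to SoPEC 2 into 0 ink usage.
attacked = printing_may_continue(written, previous, usage[2], [0, 0, 0])
```

With intact broadcasts the values agree; with zeroed broadcasts SoPEC 2 expects 970 but reads 900, so the mismatch is detected.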

Since there will be at most 4 printing SoPECs, it requires at most 4 authenticated reads.
This should be completed within 0.5 seconds - well within the 2 seconds/page print time. 

3.5.6 Example hierarchy 

The exact breakdown of hierarchy will depend on a later investigation, but for the pur- 
poses of scoping out possibilities, it is worthwhile considering an example hierarchy for 
illustrative purposes. 
Adding an extra bootloader step to the example from Section 3.5.2, we can break up the contents of program space into logical sections, as shown in Table 1. Note that the ComCo does not provide any program code, merely operating parameters that are used by the O/S.



Table 1. Sections of Program Space

Section 0 (ROM)
    Contents: boot loader 0; SHA-1 function; asymmetric decrypt function; boot0key
    Verifies: section 1 via boot0key

Section 1
    Contents: boot loader 1; SoPEC_OS_public_key
    Verifies: section 2 via SoPEC_OS_public_key

Section 2
    Contents: Silverbrook O/S program code; function to generate SoPEC_id_key from SoPEC_id; Basic Print Engine; ComCo_public_key
    Verifies: section 3 via ComCo_public_key; section 4 via OEM_public_key (supplied in section 3); PRINTER_QA data, which includes the PrintEngineLicense_id, Silverbrook operating parameters, and OEM operating parameters (all authenticated via SoPEC_id_key)

Section 3
    Contents: ComCo license agreement operating parameter ranges, including PrintEngineLicense_id (gets loaded into supervisor mode section of memory); OEM_public_key (gets loaded into supervisor mode section of memory); any ComCo written user-mode program code (gets loaded into user mode section of memory)
    Verifies: is used by section 2 to verify section 4 and range of parameters as found in PRINTER_QA

Section 4
    Contents: OEM specific program code
    Verifies: OEM operating parameters via calls to Silverbrook O/S code



The verification procedures will be required each time the CPU is woken up, since the 
RAM is not preserved. 



3.5.7 What if the CPU is not fast enough?

In the example of Section 3.5.6, every time the CPU is woken up to print a document it needs to perform:

• SHA-1 on all program code and program data
• 4 sets of asymmetric decryption to load the program code and data
• 1 HMAC-SHA1 generation per 512 bits of Silverbrook and OEM printer and ink operating parameters

Although the SHA-1 and HMAC process will be fast enough on the embedded CPU (the 
program code will be executing from ROM), it may be that the asymmetric decryption 
will be slow. And this becomes more likely with each extra level of authentication. If this 
is the case (as is likely), hardware acceleration is required. 

A cheap form of hardware acceleration takes advantage of the fact that in most cases the 
same program is loaded each time, with the first time likely to be at power-up. The hard- 
ware acceleration is simply data storage for the authorizedDigest which means that the 
boot procedure now is: 
slowCPU_bootloader0(data, sig)
    localDigest ← SHA-1(data)
    If (localDigest = previouslyStoredAuthorizedDigest)
        jump to program code at data-start address // will never return
    Else
        authorizedDigest ← decrypt(sig, boot0key)
        If (localDigest = authorizedDigest)
            previouslyStoredAuthorizedDigest ← authorizedDigest
            jump to program code at data-start address // will never return
        Else
            // program code is unauthorized
        EndIf
    EndIf



This procedure means that a reboot of the same authorized program code will only require SHA-1 processing. At power-up, or if new program code is loaded (e.g. an upgrade of a driver over the internet), then the full authorization via asymmetric decryption takes place. This is because the stored digest will not match at power-up and whenever a new program is loaded.
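
The saving from the stored digest can be demonstrated with a short Python sketch. As before, it assumes textbook RSA with a toy Mersenne-prime key (not secure); the class name and the asymmetric_ops counter are illustrative additions used only to show when the slow path runs.

```python
import hashlib

# Toy RSA key pair (illustration only, NOT secure).
P, Q, E = 2**127 - 1, 2**89 - 1, 65537
N = P * Q
D = pow(E, -1, (P - 1) * (Q - 1))


def sha1_int(data: bytes) -> int:
    return int.from_bytes(hashlib.sha1(data).digest(), "big")


def sign(data: bytes) -> int:
    """Signer side (hypothetical): encrypt the digest with the private key."""
    return pow(sha1_int(data), D, N)


class SlowCpuBootLoader0:
    def __init__(self):
        self.stored_digest = 0       # preserved storage, 0 after power-on reset
        self.asymmetric_ops = 0      # counts slow asymmetric decryptions

    def power_on_reset(self) -> None:
        self.stored_digest = 0

    def boot(self, data: bytes, sig: int) -> bool:
        local_digest = sha1_int(data)
        if local_digest == self.stored_digest:
            return True              # fast path: SHA-1 only
        self.asymmetric_ops += 1
        authorized_digest = pow(sig, E, N)
        if local_digest == authorized_digest:
            self.stored_digest = authorized_digest
            return True
        return False                 # program code is unauthorized
```

Booting the same image twice performs one asymmetric decryption in total; after a power-on reset the full check runs again.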

The question is how much preserved space is required. 

Each digest requires 160 bits (20 bytes), and this is constant regardless of the asymmetric encryption scheme or the key length. While it is possible to reduce this number of bits, thereby sacrificing security, the cost is small enough to warrant keeping the full digest.

However each level of boot loader requires its own digest to be preserved. This gives a maximum of 20 bytes per loader. Digests for operating parameters and ink levels may also be preserved in the same way, although these authentications should be fast enough not to require cached storage.

Assuming SoPEC provides for 12 digests (to be generous), this is a total of 240 bytes. These 240 bytes could easily be stored as 60 x 32-bit registers, or probably more conveniently as a small amount of RAM (e.g. 0.25 - 1 Kbyte). Providing something like 1 Kbyte of RAM has the advantage of allowing the CPU to store other useful data, although this is not a requirement.
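
The storage figures follow directly from the digest size. A short check of the arithmetic, using only the numbers quoted in the text (the constant names are illustrative):

```python
DIGEST_BITS = 160      # SHA-1 digest size
DIGESTS = 12           # generous number of preserved digests
REGISTER_BITS = 32

bytes_per_digest = DIGEST_BITS // 8                 # 20 bytes
total_bytes = DIGESTS * bytes_per_digest            # 240 bytes
registers = total_bytes * 8 // REGISTER_BITS        # 60 registers
fits_in_quarter_kbyte = total_bytes <= 256          # 0.25 Kbyte of RAM
```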

In general, it is useful for the boot ROM to know whether it is being started up due to power-on reset or activity on the USB/ISI. In the former case, it can ignore the previously stored values (either 0 for registers or garbage for RAM). In the latter case, it can use the previously stored values. Even without this, a startup value of 0 (or garbage) means the digest won't match and therefore the authentication will occur implicitly.

3.6 SoPEC ISI IDENTIFICATION

At power-up, the host can send targeted data to the USB-connected SoPEC, but can only send broadcasts to all of the slave SoPECs via the USB-connected SoPEC's ISI.

Each slave SoPEC will verify the broadcast message received over the ISI, and if it is valid, will execute it. Several levels of authorization may occur. However, at some stage, this common program code (broadcast to all of the slave SoPECs and signed by the appropriate asymmetric private key) will, among other things, set the slave SoPEC's ISI id. If there is only 1 slave, the id is given, but if there is more than 1 slave, the id must be determined in some fashion.
On a particular physical arrangement of SoPECs each slave SoPEC will have a different set of connections on GPIOs. For example, one SoPEC may be in charge of motor control, while another may be driving the LEDs etc. The unused GPIO pins (not necessarily the same on each SoPEC) can be set as inputs and then tied to 0 or 1. As long as the connection settings are mutually exclusive, program code can determine which is which, and the id appropriately set.
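
One way program code might turn strapped input pins into an ISI id is sketched below. The pin ordering, the number of pins and the function names are assumptions for illustration; the text only requires that the settings be mutually exclusive.

```python
def isi_id_from_straps(pin_levels) -> int:
    """Combine strapped pin levels (0 or 1, most significant first) into an id."""
    isi_id = 0
    for level in pin_levels:
        isi_id = (isi_id << 1) | level
    return isi_id


def straps_are_mutually_exclusive(all_straps) -> bool:
    """True if every slave SoPEC in the system resolves to a distinct id."""
    ids = [isi_id_from_straps(straps) for straps in all_straps]
    return len(set(ids)) == len(ids)


# Three hypothetical slaves, each with two unused GPIO pins tied to 0 or 1.
slaves = [(0, 1), (1, 0), (1, 1)]
slave_ids = [isi_id_from_straps(s) for s in slaves]
```

If two boards were strapped identically, straps_are_mutually_exclusive would return False and the ids could not be assigned unambiguously.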

In some multi-SoPEC systems, a given SoPEC will only be attached to a single printhead (left or right). We can conveniently use the second printhead connection pins (temperature and test) to form an ISI id.

This scheme of slave SoPEC identification does not introduce a security breach. If an 
attacker rewires the pinouts to confuse identification, at best it will simply cause strange 
printouts (e.g. swapping of printout data) to occur, while at worst the Print Engine will 
simply not function. 

Note that some physical setting (e.g. pins) on each of the multiple SoPECs is required - the settings just need to be mutually exclusive. Although it is possible for all the SoPECs to come to a logical ISI id assignment (e.g. by using ethernet-like protocols), the ISI id needs to be very much a physical identity scheme. This is because these SoPECs are not simply logical processors - we want the correct portion of the page to be printed on the correct physical location, motor controls will be physically connected to a specific physical SoPEC etc.

3.7 Setting up QA Chip keys 

In use, each INK_QA chip needs the following keys:

• K0 = SupplyInkLicense_key
• K1 = UseInkLicense_key

Each PRINTER_QA chip tied to a specific SoPEC requires the following keys:

• K0 = PrintEngineLicense_key
• K1 = SoPEC_id_key
• K2 = UseInkLicense_key

Note that there may be more than one K1 depending on the number of PRINTER_QA chips and SoPECs in a system. These keys need to be appropriately set up in the QA Chips before they will function correctly together.

3.7.1 Original QA Chips as received by a ComCo

When original QA Chips are shipped from QACo to a specific ComCo their keys are as follows:

• K0 = QACo_ComCo_Key0
• K1 = QACo_ComCo_Key1
• K2 = QACo_ComCo_Key2
• K3 = QACo_ComCo_Key3

All 4 keys are only known to QACo. Note that these keys are different for each QA Chip.

3.7.2 Steps at the ComCo 

The ComCo is responsible for making Print Engines out of Memjet printheads, QA Chips, PECs or SoPECs, PCBs etc.

In addition, the ComCo must customize the INK_QA chips and PRINTER_QA chip on-board the print engine before shipping to the OEM.

There are two stages:

• replacing the keys in QA Chips with specific keys for the application (i.e. INK_QA and PRINTER_QA)
• setting operating parameters as per the license with the OEM

3.7.2.1 Replacing keys

The ComCo is issued QID hardware [4] by QACo that allows programming of the various keys (except for K1) in a given QA Chip to the final values, following the standard ChipF/ChipP replace key (indirect version) protocol [5]. The indirect version of the protocol allows each QACo_ComCo_Key to be different for each SoPEC.

In the case of programming of PRINTER_QA's K1 to be SoPEC_id_key, there is the additional step of transferring an asymmetrically encrypted SoPEC_id_key (by the public key) along with the nonce (RP) used in the replace key protocol to the device that is functioning as a ChipF. The ChipF must decrypt the SoPEC_id_key so it can generate the standard replace key message for PRINTER_QA (functioning as a ChipP in the ChipF/ChipP protocol). The asymmetric key pair held in the ChipF equivalent should be unique to a ComCo (but still known only by QACo) to prevent damage in the case of a compromise.

Note that the various keys installed in the QA Chips (both INK_QA and PRINTER_QA) are only known to the QACo. The OEM only uses QIDs and QACo supplied ChipFs. The replace key protocol [5] allows the programming to occur without compromising the old or new key.



3.7.2.2 Setting operating parameters

There are two sets of operating parameters stored in PRINTER_QA and INK_QA:

• fixed

• upgradable

The fixed operating parameters can be written to by means of non-authenticated writes
[5] to M1+ via a QID [4], and permission bits set such that they are ReadOnly.

The upgradable operating parameters can only be written to after the QA Chips have been
programmed with the correct keys as per Section 3.7.2.1. Once they contain the correct
keys they can be programmed with appropriate operating parameters by means of a QID
and an appropriate ChipS (containing matching keys).

3 Introduction



This document describes the SoPEC ASIC (Small office home office Print Engine Controller) suitable for
use in price sensitive SoHo printer products. The SoPEC ASIC is intended to be a low cost solution for
bi-lithic printhead control, replacing the multichip solutions in larger more professional systems with a single
chip. The increased cost competitiveness is achieved by integrating several systems such as a modified
PEC1 [1] printing pipeline, CPU control system, peripherals and memory sub-system onto one SoC ASIC,
reducing component count and simplifying board design.

This section will give a general introduction to Memjet printing systems, introduce the components that
make a bi-lithic printhead system, describe possible system architectures and show how several SoPECs
can be used to achieve A3 and A4 duplex printing. The section "SoPEC ASIC" describes the SoC SoPEC
ASIC, with subsections describing the CPU, DRAM and Print Engine Pipeline subsystems. Each section
gives a detailed description of the blocks used and their operation within the overall print system. The final
section describes the bi-lithic printhead construction and associated implications to the system due to its
makeup.

Some sections of this document were derived from the Print Engine Controller Hardware Design
Specification [1] written by Silverbrook Research.

4 Nomenclature

4.1 BI-LITHIC PRINTHEAD NOTATION

A bi-lithic based printhead is constructed from 2 printhead ICs of varying sizes. The notation M:N is used
to express the size relationship of each IC, where M specifies one printhead IC in inches and N specifies
the remaining printhead IC in inches.

Section 35 Memjet Printhead contains a description of the bi-lithic printhead and related terminology.

4.2 Definitions

The following terms are used throughout this specification:

Bi-lithic printhead   Refers to printhead constructed from 2 printhead ICs.
CPU                   Refers to CPU core, caching system and MMU.
ISI-Bridge chip       A device with a high speed interface (such as USB2.0, Ethernet or IEEE1394) and
                      one or more ISI interfaces. The ISI-Bridge would be the ISIMaster for each of the
                      ISI buses it interfaces to.
ISIMaster             The ISIMaster is the only device allowed to initiate communication on the Inter
                      Sopec Interface (ISI) bus. The ISIMaster interfaces directly with the host.
ISISlave              Multi-SoPEC systems will contain one or more ISISlave SoPECs connected to the
                      ISI bus. ISISlaves can only respond to communication initiated by the ISIMaster.
LEON                  Refers to the LEON CPU core.
LineSyncMaster        The LineSyncMaster device generates the line synchronisation pulse that all
                      SoPECs in the system must synchronise their line outputs to.
Multi-SoPEC           Refers to SoPEC based print system with multiple SoPEC devices.
Netpage               Refers to page printed with tags (normally in infrared ink).
PEC1                  Refers to Print Engine Controller version 1, precursor to SoPEC used to control
                      printheads constructed from multiple angled printhead segments.
Printhead IC          Single MEMS IC used to construct bi-lithic printhead.
PrintMaster           The PrintMaster device is responsible for coordinating all aspects of the print
                      operation. There may only be one PrintMaster in a system.
QA Chip               Quality Assurance Chip.
Storage SoPEC         An ISISlave SoPEC used as a DRAM store and which does not print.
Tag                   Refers to pattern which encodes information about its position and orientation which
                      allow it to be optically located and its data contents read.



4.3 Acronyms and Abbreviations

The following acronyms and abbreviations are used in this specification:

CFU     Contone FIFO Unit
CPU     Central Processing Unit
DIU     DRAM Interface Unit
DNC     Dead Nozzle Compensator
DRAM    Dynamic Random Access Memory
DWU     DotLine Writer Unit
GPIO    General Purpose Input Output
HCU     Halftoner Compositor Unit
ICU     Interrupt Controller Unit
ISI     Inter SoPEC Interface
LDB     Lossless Bi-level Decoder
LLU     Line Loader Unit
LSS     Low Speed Serial interface
MEMS    Micro Electro Mechanical System
MMU     Memory Management Unit
PCU     SoPEC Controller Unit
PHI     PrintHead Interface
PSS     Power Save Storage Unit
RDU     Real-time Debug Unit
ROM     Read Only Memory
SCB     Serial Communication Block
SFU     Spot FIFO Unit
SMG4    Silverbrook Modified Group 4
SoPEC   Small office home office Print Engine Controller
SRAM    Static Random Access Memory
TE      Tag Encoder
TFU     Tag FIFO Unit
TIM     Timers Unit
USB     Universal Serial Bus

4.4 Pseudocode notation 

In general the pseudocode examples use C like statements with some exceptions.

Symbol and naming conventions used for pseudocode:

//                    Comment
=                     Assignment
==, !=, <, >          Operator equal, not equal, less than, greater than
+, -, *, /, %         Operator addition, subtraction, multiply, divide, modulus
&, |, ^, <<, >>, ~    Bitwise AND, bitwise OR, bitwise exclusive OR, left shift, right shift, complement
AND, OR, NOT          Logical AND, Logical OR, Logical inversion
[XX:YY]               Array/vector specifier
{a, b, c}             Concatenation operation
++, --                Increment and decrement



4.4.1 Register and signal naming conventions

In general register naming uses the C style conventions with capitalization to denote word delimiters.
Signals use RTL style notation where underscores denote word delimiters. There is a direct translation between
both conventions. For example the CmdSourceFifo register is equivalent to the cmd_source_fifo signal.
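
The translation between the two conventions can be sketched as follows. This is an illustration only: the function names are hypothetical, and the rule assumes every capital letter starts a new word, which the text does not state (names containing runs of capitals, such as acronyms, would need a separate rule).

```python
import re

def register_to_signal(register_name):
    """Convert a C style register name (CmdSourceFifo) to an RTL style signal name.

    Assumption: each capital letter marks the start of a new word.
    """
    # Insert an underscore before every capital that is not at the start, then lowercase.
    return re.sub(r"(?<!^)(?=[A-Z])", "_", register_name).lower()

def signal_to_register(signal_name):
    """Convert an RTL style signal name (cmd_source_fifo) back to the register form."""
    return "".join(word.capitalize() for word in signal_name.split("_"))

if __name__ == "__main__":
    print(register_to_signal("CmdSourceFifo"))
    print(signal_to_register("cmd_source_fifo"))
```

The round trip holds only for names that follow the one-capital-per-word assumption.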

4.5 State machine notation 

State machines should be described using the pseudocode notation outlined above. State machine
descriptions use the convention of underline to indicate the cause of a transition from one state to another and
plain text (no underline) to indicate the effect of the transition, i.e. signal transitions which occur when the
new state is entered.

A sample state machine is shown in Figure 1. 

[Figure 1: state diagram with Reset and Idle states; transitions are labelled with assignments to
cdu_diu_rreq and ignore_data]

Figure 1. Example State machine notation

5 Printing Considerations

A bi-lithic printhead produces 1600 dpi bi-level dots. On low-diffusion paper, each ejected drop forms a
22.5 µm diameter dot. Dots are easily produced in isolation, allowing dispersed-dot dithering to be
exploited to its fullest. Since the bi-lithic printhead is the width of the page and operates with a constant
paper velocity, color planes are printed in perfect registration, allowing ideal dot-on-dot printing.
Dot-on-dot printing minimizes 'muddying' of midtones caused by inter-color bleed.

A page layout may contain a mixture of images, graphics and text. Continuous-tone (contone) images and
graphics are reproduced using a stochastic dispersed-dot dither. Unlike a clustered-dot (or
amplitude-modulated) dither, a dispersed-dot (or frequency-modulated) dither reproduces high spatial frequencies (i.e.
image detail) almost to the limits of the dot resolution, while simultaneously reproducing lower spatial
frequencies to their full color depth, when spatially integrated by the eye. A stochastic dither matrix is
carefully designed to be free of objectionable low-frequency patterns when tiled across the image. As such its
size typically exceeds the minimum size required to support a particular number of intensity levels (e.g.
16x16x8 bits for 257 intensity levels).

Human contrast sensitivity peaks at a spatial frequency of about 3 cycles per degree of visual field and
then falls off logarithmically, decreasing by a factor of 100 beyond about 40 cycles per degree and
becoming immeasurable beyond 60 cycles per degree [21][22]. At a normal viewing distance of 12 inches (about
300mm), this translates roughly to 200-300 cycles per inch (cpi) on the printed page, or 400-600 samples
per inch according to Nyquist's theorem.
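
The conversion from cycles per degree to cycles per inch can be checked with a short calculation. The sketch below assumes simple flat-page geometry (one degree of visual field subtends distance x tan(1 degree) on the page); the function name is illustrative and not part of the specification.

```python
import math

def cycles_per_inch(cycles_per_degree, viewing_distance_in=12.0):
    """Convert angular spatial frequency to cycles per inch on the page."""
    # Width on the page covered by one degree of visual field.
    inches_per_degree = viewing_distance_in * math.tan(math.radians(1.0))
    return cycles_per_degree / inches_per_degree

if __name__ == "__main__":
    for cpd in (40, 60):
        cpi = cycles_per_inch(cpd)
        # Nyquist: at least two samples are needed per cycle.
        print(f"{cpd} cycles/degree -> {cpi:.0f} cpi, {2 * cpi:.0f} samples per inch")
```

At 12 inches this gives a little under 200 cpi for 40 cycles per degree and a little under 300 cpi for 60 cycles per degree, in line with the rounded range quoted above.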

In practice, contone resolution above about 300 ppi is of limited utility outside special applications such as 
medical imaging. Offset printing of magazines, for example, uses contone resolutions in the range 150 to 
300 ppi. Higher resolutions contribute slightly to color error through the dither. 

Black text and graphics are reproduced directly using bi-level black dots, and are therefore not anti-aliased
(i.e. low-pass filtered) before being printed. Text should therefore be supersampled beyond the perceptual
limits discussed above, to produce smoother edges when spatially integrated by the eye. Text resolution up
to about 1200 dpi continues to contribute to perceived text sharpness (assuming low-diffusion paper, of
course).

A Netpage printer, for example, may use a contone resolution of 267 ppi (i.e. 1600 dpi / 6), and a black
text and graphics resolution of 800 dpi. A high end office or departmental printer may use a contone
resolution of 320 ppi (1600 dpi / 5) and a black text and graphics resolution of 1600 dpi. Both formats are
capable of exceeding the quality of commercial (offset) printing and photographic reproduction.

6 Document Data Flow

6.1 CONSIDERATIONS

Because of the page-width nature of the bi-lithic printhead, each page must be printed at a constant speed 
to avoid creating visible artifacts. This means that the printing speed can't be varied to match the input 
data rate. Document rasterization and document printing are therefore decoupled to ensure the printhead 
has a constant supply of data. A page is never printed until it is fully rasterized. This can be achieved by 
storing a compressed version of each rasterized page image in memory. 

This decoupling also allows the RIP(s) to run ahead of the printer when rasterizing simple pages, buying 
time to rasterize more complex pages. 

Because contone color images are reproduced by stochastic dithering, but black text and line graphics are
reproduced directly using dots, the compressed page image format contains a separate foreground bi-level
black layer and background contone color layer. The black layer is composited over the contone layer after
the contone layer is dithered (although the contone layer has an optional black component). A final layer
of Netpage tags (in infrared or black ink) is optionally added to the page for printout.

Figure 2 shows the flow of a document from computer system to printed page.

Figure 2. Document data flow

At 267 ppi for example, an A4 page (8.26 inches x 11.7 inches) of contone CMYK data has a size of
26.3MB. At 320 ppi, an A4 page of contone data has a size of 37.8MB. Using lossy contone compression
algorithms such as JPEG [23], contone images compress with a ratio up to 10:1 without noticeable loss of
quality, giving compressed page sizes of 2.63MB at 267 ppi and 3.78MB at 320 ppi.

At 800 dpi, an A4 page of bi-level data has a size of 7.4MB. At 1600 dpi, an A4 page of bi-level data has
a size of 29.5MB. Coherent data such as text compresses very well. Using lossless bi-level compression
algorithms such as SMG4 fax as discussed in Section 8.1.2.3.1, ten-point plain text compresses with a
ratio of about 50:1. Lossless bi-level compression across an average page is about 20:1 with 10:1 possible
for pages which compress poorly. The requirement for SoPEC is to be able to print text at 10:1
compression. Assuming 10:1 compression gives compressed page sizes of 0.74MB at 800 dpi, and 2.95MB at
1600 dpi.
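
The page sizes quoted above can be reproduced with a few lines of arithmetic. The sketch below is a check under stated assumptions, not part of the specification: it assumes 1 MB = 2^20 bytes, 8 bits per contone sample and 4 contone channels (CMYK), and the function names are illustrative.

```python
# Assumed constants: A4 dimensions as given in the text, MB taken as 2**20 bytes.
A4_WIDTH_IN = 8.26
A4_HEIGHT_IN = 11.7
MB = 2 ** 20

def contone_page_mb(ppi, channels=4):
    """Uncompressed contone page size in MB (one byte per sample per channel)."""
    pixels = A4_WIDTH_IN * A4_HEIGHT_IN * ppi * ppi
    return pixels * channels / MB

def bilevel_page_mb(dpi):
    """Uncompressed bi-level page size in MB (one bit per dot)."""
    dots = A4_WIDTH_IN * A4_HEIGHT_IN * dpi * dpi
    return dots / 8 / MB

def compressed_mb(raw_mb, ratio=10):
    """Size after compression at the given ratio (10:1 by default)."""
    return raw_mb / ratio

if __name__ == "__main__":
    for ppi in (267, 320):
        raw = contone_page_mb(ppi)
        print(f"contone {ppi} ppi: {raw:.1f} MB raw, {compressed_mb(raw):.2f} MB at 10:1")
    for dpi in (800, 1600):
        raw = bilevel_page_mb(dpi)
        print(f"bi-level {dpi} dpi: {raw:.1f} MB raw, {compressed_mb(raw):.2f} MB at 10:1")
```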

Once dithered, a page of CMYK contone image data consists of 116MB of bi-level data. Using lossless
bi-level compression algorithms on this data is pointless precisely because the optimal dither is stochastic,
i.e. it introduces hard-to-compress disorder.

Netpage tag data is optionally supplied with the page image. Rather than storing a compressed bi-level
data layer for the Netpage tags, the tag data is stored in its raw form. Each tag is supplied up to 120 bits of
raw variable data (combined with up to 56 bits of raw fixed data) and covers up to a 6mm x 6mm area (at
1600 dpi). The absolute maximum number of tags on an A4 page is 15,540 when the tag is only 2mm x
2mm (each tag is 126 dots x 126 dots, for a total coverage of 148 tags x 105 tags). 15,540 tags of 128 bits
per tag gives a compressed tag page size of 0.24 MB.
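
The tag figures above follow from simple arithmetic, sketched below. The A4 dimensions in millimetres (210 x 297) and the 2^20 byte megabyte are assumptions not stated in the text; the 128 bits per tag is the figure used in the calculation above, and the function names are illustrative.

```python
MM_PER_INCH = 25.4
MB = 2 ** 20  # assumed definition of MB

def tag_size_dots(tag_mm, dpi=1600):
    """Edge length of a square tag in dots at the given resolution."""
    return round(tag_mm / MM_PER_INCH * dpi)

def max_tags_on_a4(tag_mm=2, page_w_mm=210, page_h_mm=297):
    """Number of whole tags that tile an A4 page (assumed 210mm x 297mm)."""
    return (page_w_mm // tag_mm) * (page_h_mm // tag_mm)

def tag_page_mb(num_tags, bits_per_tag=128):
    """Raw tag data size for a page in MB."""
    return num_tags * bits_per_tag / 8 / MB

if __name__ == "__main__":
    tags = max_tags_on_a4()
    print(tag_size_dots(2), "dots per tag edge")
    print(tags, "tags per page")
    print(f"{tag_page_mb(tags):.2f} MB of tag data")
```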

The multi-layer compressed page image format therefore exploits the relative strengths of lossy JPEG con- 
tone image compression, lossless bi-level text compression, and tag encoding. The format is compact 
enough to be storage-efficient, and simple enough to allow straightforward real-time expansion during 
printing. 

Since text and images normally don't overlap, the normal worst-case page image size is image only, while
the normal best-case page image size is text only. The addition of worst case Netpage tags adds 0.24MB to
the page image size. The worst-case page image size is text over image plus tags. The average page size
assumes a quarter of an average page contains images. Table 1 shows data sizes for a compressed A4
page for these different options.



Table 1. Data sizes for A4 page (8.26 inches x 11.7 inches)

Page content                              267 ppi contone,    320 ppi contone,
                                          800 dpi bi-level    1600 dpi bi-level
Image only (contone), 10:1 compression    2.63 MB             3.78 MB
Text only (bi-level), 10:1 compression    0.74 MB             2.95 MB
Netpage tags, 1600 dpi                    0.24 MB             0.24 MB
Worst case (text + image + tags)          3.61 MB             6.67 MB
Average (text + 25% image + tags)         1.64 MB             4.25 MB



6.2 Document Data Flow 

The Host PC rasterizes and compresses the incoming document on a page by page basis. The page is
restructured into bands with one or more bands used to construct a page. The compressed data is then
transferred to the SoPEC device via the USB link. A complete band is stored in SoPEC embedded
memory. Once the band transfer is complete the SoPEC device reads the compressed data, expands the band,
normalizes contone, bi-level and tag data to 1600 dpi and transfers the resultant calculated dots to the
bi-lithic printhead.

The document data flow is

• The RIP software rasterizes each page description and compresses the rasterized page image.

• The infrared layer of the printed page optionally contains encoded Netpage [5] tags at a programmable
density.

• The compressed page image is transferred to the SoPEC device via the USB, normally on a band by
band basis.

• The print engine takes the compressed page image and starts the page expansion.

• The first stage page expansion consists of 3 operations performed in parallel:

  • expansion of the JPEG-compressed contone layer

  • expansion of the SMG4 fax compressed bi-level layer

  • encoding and rendering of the bi-level tag data.

• The second stage dithers the contone layer using a programmable dither matrix, producing up to four
bi-level layers at full-resolution.

• The second stage then composites the bi-level tag data layer, the bi-level SMG4 fax de-compressed
layer and up to four bi-level JPEG de-compressed layers into the full-resolution page image.

• A fixative layer is also generated as required.

• The last stage formats and prints the bi-level data through the bi-lithic printhead via the printhead
interface.

The SoPEC device can print a full resolution page with 6 color planes. Each of the color planes can be
generated from compressed data through any channel (either JPEG compressed, bi-level SMG4 fax
compressed, tag data generated, or fixative channel created) with a maximum number of 6 data channels from
page RIP to bi-lithic printhead color planes.

The mapping of data channels to color planes is programmable. This allows for multiple color planes in the
printhead to map to the same data channel to provide for redundancy in the printhead to assist dead nozzle
compensation.
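
As an illustration of a programmable mapping in which two color planes share one data channel, consider the sketch below. The channel names, plane names and function are hypothetical and only model the idea; they do not describe SoPEC registers or the actual configuration interface.

```python
MAX_COLOR_PLANES = 6  # the text states a maximum of 6 color planes

def map_channels_to_planes(channel_dots, plane_to_channel):
    """Return the dot data each color plane receives, given a {plane: channel} mapping.

    Several planes may name the same channel (e.g. a redundant black plane).
    """
    if len(plane_to_channel) > MAX_COLOR_PLANES:
        raise ValueError("at most 6 color planes are supported")
    return {plane: channel_dots[channel] for plane, channel in plane_to_channel.items()}

# Hypothetical example data: one short line of dots per data channel.
channel_dots = {
    "contone_c": [1, 0, 1, 1],
    "contone_m": [0, 0, 1, 0],
    "contone_y": [1, 1, 0, 0],
    "bilevel_k": [1, 1, 0, 0],
    "tag_ir": [0, 0, 0, 1],
}
plane_to_channel = {
    "plane0": "contone_c",
    "plane1": "contone_m",
    "plane2": "contone_y",
    "plane3": "bilevel_k",
    "plane4": "bilevel_k",  # second plane fed from the same channel for redundancy
    "plane5": "tag_ir",
}
planes = map_channels_to_planes(channel_dots, plane_to_channel)
```

Here plane3 and plane4 receive identical dot data, which is the redundancy case described above.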

Also a data channel could be used to gate data from another data channel. For example in stencil mode,
data from the bi-level data channel at 1600 dpi can be used to filter the contone data channel at 320 dpi,
giving the effect of a 1600 dpi contone image.

6.3 Page considerations due to SoPEC 

SoPEC typically stores a complete page of document data on chip. The amount of storage
available for compressed pages is limited to 2Mbytes, imposing a fixed maximum on compressed page
size. A comparison of the compressed image sizes in Table 1 indicates that SoPEC would not be capable
of printing worst case pages unless they are split into bands and printing commences before all the bands
for the page have been downloaded. The page sizes in the table are shown for comparison purposes and
would be considered reasonable for a professional level printing system. The SoPEC device is aimed at the
consumer level and would not be required to print pages of that complexity. Target document types for the
SoPEC device are shown in Table 2.



Table 2. Page content targets for SoPEC

Page content description                                   Calculation              Size (MByte)
Mixed Graphics and Text                                    6x4x267x267x3 @ 5:1      1.55
 - Image of 6 inches x 4 inches @ 267 ppi and 3 colors     800x800x73 @ 10:1
 - Remaining area text ~73 inches², 800 dpi
Best Case Photo, 3 Colors, 6.6 Megapixel Image             6.6 Mpixel @ 10:1        2.00



If a document with more complex pages is required, the page RIP software in the host PC can determine
that there is insufficient memory storage in the SoPEC for that document. In such cases the RIP software
can take two courses of action. It can increase the compression ratio until the compressed page size will fit
in the SoPEC device, at the expense of document quality, or divide the page into bands and allow SoPEC
to begin printing a page band before all bands for that page are downloaded. Once SoPEC starts printing a
page it cannot stop. If SoPEC consumes compressed data faster than the bands can be downloaded, a buffer
underrun error could occur, causing the print to fail. A buffer underrun occurs if a line synchronisation pulse
is received before a line of data has been transferred to the printhead.

Other options which can be considered if the page does not fit completely into the compressed page store 
are to slow the printing or to use multiple SoPECs to print parts of the page. A Storage SoPEC (Section 
7.2.5) could be added to the system to provide guaranteed bandwidth data delivery. The print system could 
also be constructed using an ISI-Bridge chip (Section 7.2.6) to provide guaranteed data delivery. 



Doc: SoPEC_hardware_design
Version: 2.3

S3 Proprietary Document

29 Nov 2002

SoPEC : Hardware Design



7 Memjet Printer Architecture 

The SoPEC device can be used in several printer configurations and architectures. 

In the general sense every SoPEC based printer architecture will contain:

• One or more SoPEC devices.

• One or more bi-lithic printheads.

• Two or more LSS busses.

• Two or more QA chips.

• USB 1.1 connection to host or ISI connection to Bridge Chip.

• ISI bus connection between SoPECs (when multiple SoPECs are used).

Some example printer configurations are outlined in Section 7.2. The various system components are
outlined briefly in Section 7.1.

7.1 System Components 

7.1.1 SoPEC Print Engine Controller 

The SoPEC device contains several system on a chip (SoC) components, as well as the print engine
pipeline control application specific logic.

7.1.1.1 Print Engine Pipeline (PEP) Logic

The PEP reads compressed page store data from the embedded memory, optionally decompresses the data
and formats it for sending to the printhead. The print engine pipeline functionality includes expanding the
page image, dithering the contone layer, compositing the black layer over the contone layer, rendering of
Netpage tags, compensation for dead nozzles in the printhead, and sending the resultant image to the
bi-lithic printhead.

7.1.1.2 Embedded CPU

SoPEC contains an embedded CPU for general purpose system configuration and management. The CPU
performs page and band header processing, motor control and sensor monitoring (via the GPIO) and other
system control functions. The CPU can perform buffer management or report buffer status to the host. The
CPU can optionally run vendor application specific code for general print control such as paper ready
monitoring and LED status update.

7.1.1.3 Embedded Memory Buffer

A 2.5Mbyte embedded memory buffer is integrated onto the SoPEC device, of which approximately
2Mbytes are available for compressed page store data. A compressed page is divided into one or more
bands, with a number of bands stored in memory. As a band of the page is consumed by the PEP for
printing a new band can be downloaded. The new band may be for the current page or the next page.

Using banding it is possible to begin printing a page before the complete compressed page is downloaded, 
but care must be taken to ensure that data is always available for printing or a buffer underrun may occur. 

A Storage SoPEC acting as a memory buffer (Section 7.2.5) or an ISI-Bridge chip with attached DRAM
(Section 7.2.6) could be used to provide guaranteed data delivery.

7.1.1.4 Embedded USB 1.1 Device

The embedded USB 1.1 device accepts compressed page data and control commands from the host PC,
and facilitates the data transfer to either embedded memory or to another SoPEC device in multi-SoPEC
systems.



7.1.2 Bi-lithic Printhead

The printhead is constructed by abutting 2 printhead ICs together. The printhead ICs can vary in size from
2 inches to 8 inches, so to produce an A4 printhead several combinations are possible. For example two
printhead ICs of 7 inches and 3 inches could be used to create an A4 printhead (the notation is 7:3).
Similarly a 6 and 4 combination (6:4) or a 5:5 combination could be used. An A3 printhead can be constructed
from an 8:6 or a 7:7 printhead IC combination. For photographic printing smaller printheads can be constructed.
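
The possible M:N combinations can be enumerated with a short sketch. The combined lengths of 10 for an A4 printhead and 14 for an A3 printhead are assumptions inferred from the examples above (7:3, 6:4, 5:5 and 8:6, 7:7); the function name is illustrative.

```python
def ic_combinations(total, min_ic=2, max_ic=8):
    """List M:N pairs (M >= N) of printhead IC sizes that add up to the given total.

    IC sizes are limited to the 2 to 8 inch range stated in the text.
    """
    pairs = []
    for m in range(max_ic, min_ic - 1, -1):
        n = total - m
        if min_ic <= n <= m:
            pairs.append((m, n))
    return pairs

if __name__ == "__main__":
    for name, total in (("A4", 10), ("A3", 14)):
        combos = ", ".join(f"{m}:{n}" for m, n in ic_combinations(total))
        print(f"{name}: {combos}")
```

Under these assumptions the A4 case also yields 8:2, which the text does not list but which falls within the stated size range.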



7.1.3 LSS interface bus 



Each SoPEC device has 2 LSS system buses for communication with QA devices for system
authentication and ink usage accounting. The number of QA devices per bus and their position in the system is
unrestricted with the exception that PRINTER_QA and INK_QA devices should be on separate LSS busses.

7.1.4 QA devices

Each SoPEC system can have several QA devices. Normally each printing SoPEC will have an associated
PRINTER_QA. Ink cartridges will contain an INK_QA chip. PRINTER_QA and INK_QA devices should
be on separate LSS busses. All QA chips in the system are physically identical, with flash memory contents
distinguishing a PRINTER_QA from an INK_QA chip.

7.1.5 ISI interface

The Inter-SoPEC Interface (ISI) provides a communication channel between SoPECs in a multi-SoPEC
system. The ISIMaster can be a SoPEC device or an ISI-Bridge chip depending on the printer configuration.
Both compressed data and control commands are transferred via the interface.

7.1.6 ISI-Bridge Chip

A device, other than a SoPEC with a USB connection, which provides print data to a number of slave
SoPECs. A bridge chip will typically have a high bandwidth connection, such as USB2.0, Ethernet or
IEEE1394, to a host and may have an attached external DRAM for compressed page storage. A bridge
chip would have one or more ISI interfaces. The use of multiple ISI buses would allow the construction of
independent print systems within the one printer. The ISI-Bridge would be the ISIMaster for each of the
ISI buses it interfaces to.

7.2 Possible SoPEC Systems 

Several possible SoPEC based system architectures exist. The following sections outline some possible
architectures. It is possible to have extra SoPEC devices in the system used for DRAM storage. The QA
chip configurations shown are indicative of the flexibility of LSS bus architecture, but are not limited to those
configurations.

7.2.1 A4 Simplex with 1 SoPEC device

Figure 3. Single SoPEC A4 Simplex system

In Figure 3, a single SoPEC device can be used to control two printhead ICs. The SoPEC receives com- 
pressed data through the USB device from the host. The compressed data is processed and transferred to 
the printhead. 



7.2.2 A4 Duplex with 2 SoPEC devices 



Figure 4. Dual SoPEC A4 Duplex system

In Figure 4, two SoPEC devices are used to control two bi-lithic printheads, each with two printhead ICs.
Each bi-lithic printhead prints to opposite sides of the same page to achieve duplex printing. The SoPEC
connected to the USB is the ISIMaster SoPEC, the remaining SoPEC is an ISISlave. The ISIMaster
receives all the compressed page data for both SoPECs and re-distributes the compressed data over the
Inter-SoPEC Interface (ISI) bus.






It may not be possible to print an A4 page every 2 seconds in this configuration since the USB 1.1
connection to the host may not have enough bandwidth. An alternative would be for each SoPEC to have its own
USB 1.1 connection. This would allow a faster average print speed.
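
The bandwidth limit can be made concrete with a rough estimate. The sketch below assumes the USB 1.1 link sustains about 1 Mbyte per second (the 2Mbytes every 2 seconds quoted for the A3 configurations in this section) and that a page can never print faster than the 2 second target; protocol overhead and band scheduling are ignored, and the names are illustrative.

```python
USB_MBYTES_PER_SECOND = 1.0   # assumption: 2 Mbytes every 2 seconds
TARGET_PAGE_SECONDS = 2.0     # target print time per page

def page_time_seconds(compressed_mbytes, links=1):
    """Estimated time per page when limited by the USB link(s) to the host."""
    transfer_seconds = compressed_mbytes / (USB_MBYTES_PER_SECOND * links)
    return max(TARGET_PAGE_SECONDS, transfer_seconds)

if __name__ == "__main__":
    # Duplex: both sides of the sheet share one link unless each SoPEC has its own.
    side_mbytes = 1.64  # average compressed A4 page at 267 ppi / 800 dpi from Table 1
    print(page_time_seconds(2 * side_mbytes, links=1))
    print(page_time_seconds(2 * side_mbytes, links=2))
```

With one shared link the two average sides take about 3.3 seconds to transfer, while a link per SoPEC brings the estimate back to the 2 second target.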



7.2.3 A3 Simplex with 2 SoPEC devices 



Figure 5. Dual SoPEC A3 simplex system

In Figure 5, two SoPEC devices are used to control one A3 bi-lithic printhead. Each SoPEC controls only
one printhead IC (the remaining PHI port typically remains idle). The USB 1.1 connection defines the
ISIMaster SoPEC. In this dual SoPEC configuration the compressed page store data is split across 2 SoPECs
giving a total of 4Mbyte page store. This allows the system to use compression rates as in an A4
architecture, but with the increased page size of A3. The ISIMaster receives all the compressed page data for all
SoPECs and re-distributes the compressed data over the Inter-SoPEC Interface (ISI) bus.

It may not be possible to print an A3 page every 2 seconds in this configuration since the USB 1.1
connection to the host will only have enough bandwidth to supply 2Mbytes every 2 seconds. Pages which require
more than 2Mbytes every 2 seconds will therefore print slower. An alternative would be for each SoPEC
to have its own USB 1.1 connection. This would allow a faster average print speed.

7.2.4 A3 Duplex with 4 SoPEC devices

Figure 6. Quad SoPEC A3 duplex system



In Figure 6 a 4 SoPEC system is shown. It contains 2 A3 bi-lithic printheads, one for each side of an A3
page. Each printhead contains 2 printhead ICs, and each printhead IC is controlled by an independent SoPEC
device, with the remaining PHI port typically unused. Again the USB 1.1 connection defines the ISIMaster
with the other SoPECs as ISISlaves. In total, the system contains 8Mbytes of compressed page store
(2Mbytes per SoPEC), so the increased page size does not degrade the system print quality from that of an
A4 simplex printer. The ISIMaster receives all the compressed page data for all SoPECs and re-distributes
the compressed data over the Inter-SoPEC Interface (ISI) bus.

It may not be possible to print an A3 page every 2 seconds in this configuration since the USB 1.1
connection to the host will only have enough bandwidth to supply 2Mbytes every 2 seconds. Pages which require
more than 2Mbytes every 2 seconds will therefore print slower. An alternative would be for each SoPEC
or set of SoPECs on the same side of the page to have their own USB 1.1 connection. This would allow a
faster average print speed.

7.2.5 SoPEC DRAM storage solution: A4 Simplex with 1 printing SoPEC and 1 memory SoPEC

Figure 7. SoPEC A4 Simplex system with extra SoPEC used as DRAM storage

Extra SoPECs can be used for DRAM storage, e.g. in Figure 7 an A4 simplex printer can be built with a
single extra SoPEC used for DRAM storage. The DRAM SoPEC can provide guaranteed bandwidth
delivery of data to the printing SoPEC. SoPEC configurations can have multiple extra SoPECs used for DRAM
storage.

7.2.6 ISI-Bridge chip solution: A3 Duplex system with 4 SoPEC devices

Figure 8. A3 duplex system featuring four printing SoPECs



In Figure 8, an ISI-Bridge chip provides slave-only ISI connections to SoPEC devices. Figure 8 shows an
ISI-Bridge chip with 2 separate ISI ports. The ISI-Bridge chip is the ISIMaster on each of the ISI busses it
is connected to. All connected SoPECs are ISISlaves. The ISI-Bridge chip will typically have a high
bandwidth connection to a host and may have an attached external DRAM for compressed page storage.

An alternative to having an ISI-Bridge chip would be for each SoPEC or each set of SoPECs on the same
side of a page to have their own USB 1.1 connection. This would allow a faster average print speed.






8 Page Format and Printflow 



When rendering a page, the RIP produces a page header and a number of bands (a non-blank page requires
at least one band). The page header contains high level rendering parameters, and each band
contains compressed page data. The size of the band will depend on the memory available to the RIP, the
speed of the RIP, and the amount of memory remaining in SoPEC while printing the previous band(s).
Figure 9 shows the high level data structure of a number of pages with different numbers of bands in the page.



[Figure 9 diagram: a blank page (page header only), a single band page (page header, band 0), a 2 band page (page header, band 0, band 1) and a multi band page (page header, band 0 through band n).]



Figure 9. Pages containing different numbers of bands 

Each compressed band contains a mandatory band header, an optional bi-level plane, optional sets of interleaved
contone planes, and an optional tag data plane (for Netpage enabled applications). Since each of
these planes is optional (see footnote 1), the band header specifies which planes are included with the band. Figure 10
gives a high-level breakdown of the contents of a page band.



[Figure 10 diagram: band n consists of a band header, a bi-level plane, a contone plane and a tag data plane.]



Figure 10. Contents of a page band 

A single SoPEC has maximum rendering restrictions as follows: 

• 1 bi-level plane 

• 1 contone interleaved plane set containing a maximum of 4 contone planes 

• 1 tag data plane 

• a bi-lithic printhead with a maximum of 2 printhead ICs

The requirement for single-sided A4 single SoPEC printing is:

• average contone JPEG compression ratio of 10:1, with a local minimum compression ratio of 5:1 for a
single line of interleaved JPEG blocks.

• average bi-level compression ratio of 10:1, with a local minimum compression ratio of 1:1 for a single
line.

1. Although a band must contain at least one plane.

If the page contains rendering parameters that exceed these specifications, then the RIP or the Host PC 
must split the page into a format that can be handled by a single SoPEC. 
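
The restrictions above lend themselves to a simple host-side check. The following Python sketch is illustrative only: the names PageRequirements and fits_single_sopec are hypothetical and are not part of any SoPEC software, and only the plane and printhead IC limits listed above are modelled (the compression ratio requirements are not checked).

```python
from dataclasses import dataclass

# Limits taken from the list of single-SoPEC rendering restrictions.
MAX_BILEVEL_PLANES = 1
MAX_CONTONE_PLANES = 4      # within one interleaved contone plane set
MAX_TAG_PLANES = 1
MAX_PRINTHEAD_ICS = 2


@dataclass
class PageRequirements:
    """Hypothetical summary of what a page needs (names are illustrative)."""
    bilevel_planes: int
    contone_planes: int
    tag_planes: int
    printhead_ics: int


def fits_single_sopec(req: PageRequirements) -> bool:
    """Return True if the page stays within the single-SoPEC limits."""
    return (
        0 <= req.bilevel_planes <= MAX_BILEVEL_PLANES
        and 0 <= req.contone_planes <= MAX_CONTONE_PLANES
        and 0 <= req.tag_planes <= MAX_TAG_PLANES
        and 1 <= req.printhead_ics <= MAX_PRINTHEAD_ICS
    )
```

A page for which this check fails would have to be split by the RIP or the Host PC before being sent to a single SoPEC.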

In the general case, the SoPEC CPU must analyze the page and band headers and generate an appropriate 
set of register write commands to configure the units in SoPEC for that page. The various bands are passed 
to the destination SoPEC(s) to locations in DRAM determined by the host. 

The host keeps a memory map for the DRAM, and ensures that as a band is passed to a SoPEC, it is stored
in a suitable free area in DRAM. Each SoPEC is connected to the ISI bus or USB bus via its Serial Communication
Block (SCB). The SoPEC CPU configures the SCB to allow compressed data bands to pass
from the USB or ISI through the SCB to SoPEC DRAM. Figure 11 shows an example data flow for a page
destined to be printed by a single SoPEC. Band usage information is generated by the individual SoPECs
and passed back to the host.



[Figure 11 diagram. Labels: Host RIP; page/band header; bi-level plane; contone interleaved plane; tag data plane; register commands; CPU; SoPEC's Registers.]



Figure 11. Page data path from host to SoPEC 

SoPEC has an addressing mechanism that permits circular band memory allocation, thus facilitating easy
memory management. However, it is not strictly necessary that all bands be stored together. As long as the
appropriate registers in SoPEC are set up for each band, and a given band is contiguous (see footnote 1), the memory can
be allocated in any way.
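
As an illustration of circular band allocation, the following Python sketch models a host-side memory map in which bands are allocated in order, released in order, and allowed to wrap around the end of the band store. The class CircularBandStore and its methods are hypothetical; the actual SoPEC registers and host software are not described in this section.

```python
class CircularBandStore:
    """Toy model of a host-side memory map for a circular band store."""

    def __init__(self, size: int):
        self.size = size      # total bytes in the band store
        self.head = 0         # offset where the next band will start
        self.used = 0         # bytes occupied by bands not yet printed
        self.bands = []       # (start, length) in allocation order

    def allocate(self, length: int):
        """Reserve `length` bytes; the band may wrap past the end of the store."""
        if length <= 0 or length > self.size - self.used:
            return None       # not enough free space yet
        start = self.head
        self.head = (self.head + length) % self.size
        self.used += length
        self.bands.append((start, length))
        return start

    def release_oldest(self):
        """Free the oldest band once the printer has consumed it."""
        start, length = self.bands.pop(0)
        self.used -= length
        return start, length

    def segments(self, start: int, length: int):
        """Split a possibly wrapping band into one or two linear segments."""
        first = min(length, self.size - start)
        parts = [(start, first)]
        if first < length:
            parts.append((0, length - first))
        return parts
```

Because bands are released in the order they were allocated, the free space always forms a single region that is contiguous in the circular sense, which is what keeps the memory management simple.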



1. Contiguous allocation also includes wrapping around in SoPEC's band store memory.

8.1 Print engine example page format

This section describes a possible format of compressed pages expected by the embedded CPU in SoPEC.
The format is generated by software in the host PC and interpreted by embedded software in SoPEC. This
section indicates the type of information in a page format structure, but implementations need not be limited
to this format. The host PC can optionally perform the majority of the header processing.

The compressed format and the print engines are designed to allow real-time page expansion during print- 
ing, to ensure that printing is never interrupted in the middle of a page due to data underrun. 

The page format described here is for a single black bi-level layer, a contone layer, and a Netpage tag
layer. The black bi-level layer is defined to composite over the contone layer.

The black bi-level layer consists of a bitmap containing a 1-bit opacity for each pixel. This black layer
matte has a resolution which is an integer or non-integer factor of the printer's dot resolution. The highest
supported resolution is 1600 dpi, i.e. the printer's full dot resolution.

The contone layer, optionally passed in as YCrCb, consists of a 24-bit CMY or 32-bit CMYK color for 
each pixel. This contone image has a resolution which is an integer or non-integer factor of the printer's 
dot resolution. The requirement for a single SoPEC is to support 1 side per 2 seconds A4/Letter printing at 
a resolution of 267 ppi, i.e. one-sixth the printer's dot resolution. 

Non-integer scaling can be performed on both the contone and bi-level images. Only integer scaling can be 
performed on the tag data. 

The black bi-level layer and the contone layer are both in compressed form for efficient storage in the 
printer's internal memory. 



8.1.1 Page structure

A single SoPEC is able to print with full edge bleed for Letter and A3 via different stitch part combinations
of the bi-lithic printhead. It imposes no margins and so has a printable page area which corresponds
to the size of its paper. The target page size is constrained by the printable page area, less the explicit (target)
left and top margins specified in the page description. These relationships are illustrated below.



[Figure 12 diagram. Labels: target top margin; target bottom margin; target page; printable page area (physical page).]



Figure 12. Page structure

8.1.2 Compressed page format

Apart from being implicitly defined in relation to the printable page area, each page description is complete
and self-contained. There is no data stored separately from the page description to which the page
description refers. The page description consists of a page header which describes the size and resolution
of the page, followed by one or more page bands which describe the actual page content.

8.1.2.1 Page header

Table 3 shows an example format of a page header. 



Table 3. Page header format

field | format | description
signature | 16-bit integer | Page header format signature.
version | 16-bit integer | Page header format version number.
structure size | 16-bit integer | Size of page header.
band count | 16-bit integer | Number of bands specified for this page.
target resolution (dpi) | 16-bit integer | Resolution of target page. This is always 1600 for the Memjet printer.
target page width | 16-bit integer | Width of target page, in dots.
target page height | 32-bit integer | Height of target page, in dots.
target left margin for black and contone | 16-bit integer | Width of target left margin, in dots, for black and contone.
target top margin for black and contone | 16-bit integer | Height of target top margin, in dots, for black and contone.
target right margin for black and contone | 16-bit integer | Width of target right margin, in dots, for black and contone.
target bottom margin for black and contone | 16-bit integer | Height of target bottom margin, in dots, for black and contone.
target left margin for tags | 16-bit integer | Width of target left margin, in dots, for tags.
target top margin for tags | 16-bit integer | Height of target top margin, in dots, for tags.
target right margin for tags | 16-bit integer | Width of target right margin, in dots, for tags.
target bottom margin for tags | 16-bit integer | Height of target bottom margin, in dots, for tags.
generate tags | 16-bit integer | Specifies whether to generate tags for this page (0 - no, 1 - yes).
fixed tag data | 128-bit integer | This is only valid if generate tags is set.
tag vertical scale factor | 16-bit integer | Scale factor in vertical direction from tag data resolution to target resolution. Valid range = 1-511. Integer scaling only.
tag horizontal scale factor | 16-bit integer | Scale factor in horizontal direction from tag data resolution to target resolution. Valid range = 1-511. Integer scaling only.
bi-level layer vertical scale factor | 16-bit integer | Scale factor in vertical direction from bi-level resolution to target resolution (must be 1 or greater). May be non-integer. Expressed as a fraction with the upper 8 bits the numerator and the lower 8 bits the denominator.
bi-level layer horizontal scale factor | 16-bit integer | Scale factor in horizontal direction from bi-level resolution to target resolution (must be 1 or greater). May be non-integer. Expressed as a fraction with the upper 8 bits the numerator and the lower 8 bits the denominator.
bi-level layer page width | 16-bit integer | Width of bi-level layer page, in pixels.
bi-level layer page height | 32-bit integer | Height of bi-level layer page, in pixels.
contone flags | 16-bit integer | Defines the color conversion that is required for the JPEG data.
    Bits 2-0 specify how many contone planes there are (e.g. 3 for CMY and 4 for CMYK).
    Bit 3 specifies whether the first 3 color planes need to be converted back from YCrCb to CMY. Only valid if b2-0 = 3 or 4.
        0 - no conversion, leave JPEG colors alone
        1 - color convert
    Bits 7-4 specify whether the YCrCb was generated directly from CMY, or whether it was converted to RGB first via the step: R = 255-C, G = 255-M, B = 255-Y. Each of the color planes can be individually inverted.
        Bit 4: 0 - do not invert color plane 0; 1 - invert color plane 0
        Bit 5: 0 - do not invert color plane 1; 1 - invert color plane 1
        Bit 6: 0 - do not invert color plane 2; 1 - invert color plane 2
        Bit 7: 0 - do not invert color plane 3; 1 - invert color plane 3
    Bit 8 specifies whether the contone data is JPEG compressed or non-compressed:
        0 - JPEG compressed
        1 - non-compressed
    The remaining bits are reserved (0).
contone vertical scale factor | 16-bit integer | Scale factor in vertical direction from contone channel resolution to target resolution. Valid range = 1-255. May be non-integer. Expressed as a fraction with the upper 8 bits the numerator and the lower 8 bits the denominator.
contone horizontal scale factor | 16-bit integer | Scale factor in horizontal direction from contone channel resolution to target resolution. Valid range = 1-255. May be non-integer. Expressed as a fraction with the upper 8 bits the numerator and the lower 8 bits the denominator.
contone page width | 16-bit integer | Width of contone page, in contone pixels.
contone page height | 32-bit integer | Height of contone page, in contone pixels.
reserved | up to 128 bytes | Reserved and 0 pads out page header to multiple of 128 bytes.

1. SoPEC relies on dither matrices and tag structures to have already been set up, but these are not considered to be part of a general page
format. It is trivial to extend the page format to allow exact specification of dither matrices and tag structures.




The page header contains a signature and version which allow the CPU to identify the page header format.
If the signature and/or version are missing or incompatible with the CPU, then the CPU can reject the
page.

The contone flags define how many contone layers are present, which typically is used for defining
whether the contone layer is CMY or CMYK. Additionally, if the color planes are CMY, they can be
optionally stored as YCrCb, and further optionally color space converted from CMY directly or via RGB.
Finally the contone data is specified as being either JPEG compressed or non-compressed.

The page header defines the resolution and size of the target page. The bi-level and contone layers are
clipped to the target page if necessary. This happens whenever the bi-level or contone scale factors are not
factors of the target page width or height.

The target left, top, right and bottom margins define the positioning of the target page within the printable
page area.

The tag parameters specify whether or not Netpage tags should be produced for this page and what orientation
the tags should be produced at (landscape or portrait mode). The fixed tag data is also provided.

The contone, bi-level and tag layer parameters define the page size and the scale factors.
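
Two of the page header fields pack several values into one 16-bit integer: the non-integer scale factors (upper 8 bits numerator, lower 8 bits denominator) and the contone flags. The Python sketch below shows one way embedded or host software might unpack them. The function names and the dictionary keys are hypothetical; only the bit positions come from Table 3.

```python
from fractions import Fraction


def decode_scale_factor(field: int) -> Fraction:
    """Decode a 16-bit scale factor: upper 8 bits numerator, lower 8 bits denominator."""
    numerator = (field >> 8) & 0xFF
    denominator = field & 0xFF
    if denominator == 0:
        raise ValueError("scale factor denominator is zero")
    return Fraction(numerator, denominator)


def decode_contone_flags(flags: int) -> dict:
    """Split the 16-bit contone flags field into the parts listed in Table 3."""
    return {
        "plane_count": flags & 0x7,                                        # bits 2-0
        "ycrcb_to_cmy": bool(flags & 0x8),                                 # bit 3
        "invert_plane": [bool((flags >> (4 + i)) & 1) for i in range(4)],  # bits 7-4
        "non_compressed": bool(flags & 0x100),                             # bit 8
    }
```

For example, a 267 ppi contone layer printed at 1600 dpi corresponds to a scale factor of 6, which could be encoded as 0x0601 (6/1) under the fraction layout above.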

8.1.2.2 Band format

Table 4 shows the format of the page band header. 



Table 4. Band header format

field | format | description
signature | 16-bit integer | Page band header format signature.
version | 16-bit integer | Page band header format version number.
structure size | 16-bit integer | Size of page band header.
bi-level layer band height | 16-bit integer | Height of bi-level layer band, in black pixels.
bi-level layer band data size | 32-bit integer | Size of bi-level layer band data, in bytes.
contone band height | 16-bit integer | Height of contone band, in contone pixels.
contone band data size | 32-bit integer | Size of contone plane band data, in bytes.
tag band height | 16-bit integer | Height of tag band, in dots.
tag band data size | 32-bit integer | Size of unencoded tag data band, in bytes. Can be 0, which indicates that no tag data is provided.
reserved | up to 128 bytes | Reserved and 0 pads out band header to multiple of 128 bytes.



The bi-level layer parameters define the height of the black band, and the size of its compressed band data.
The variable-size black data follows the page band header.

The contone layer parameters define the height of the contone band, and the size of its compressed page
data. The variable-size contone data follows the black data.

The tag band data is the set of variable tag data half-lines as required by the tag encoder. The format of the
tag data is found in Section 26.5.2. The tag band data follows the contone data.

Table 5 shows the format of the variable-size compressed band data which follows the page band header. 
Table 5. Page band data format

field | format | description
black data | Modified G4 facsimile bitstream (see footnote 1) | Compressed bi-level layer.
contone data | JPEG bytestream | Compressed contone datalayer.
tag data map | Tag data array | Tag data format. See Section 26.5.2.

1. See section 8.1.2.3 on page 31 for note regarding the use of this standard.

The start of each variable-size segment of band data should be aligned to a 256-bit DRAM word boundary.
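
The padding and alignment rules can be illustrated with a small offset calculation. This Python sketch is only an assumption about how a host might lay out one band: it assumes the band header itself starts on an aligned address and that offsets are measured from the start of that header. The names align_up and band_layout are hypothetical.

```python
DRAM_WORD_BYTES = 256 // 8    # one 256-bit DRAM word is 32 bytes
HEADER_PAD_BYTES = 128        # headers are zero-padded to a multiple of 128 bytes


def align_up(offset: int, alignment: int) -> int:
    """Round `offset` up to the next multiple of `alignment`."""
    return (offset + alignment - 1) // alignment * alignment


def band_layout(header_size: int, black_size: int, contone_size: int, tag_size: int) -> dict:
    """Return the start offset of each variable-size segment of one band."""
    offsets = {}
    cursor = align_up(header_size, HEADER_PAD_BYTES)
    # Segment order follows the text: black data, then contone, then tag data.
    for name, size in (("black", black_size), ("contone", contone_size), ("tag", tag_size)):
        cursor = align_up(cursor, DRAM_WORD_BYTES)
        offsets[name] = cursor
        cursor += size
    offsets["end"] = cursor
    return offsets
```

Since 128 is a multiple of 32, a header padded to 128 bytes always leaves the black data on a 256-bit word boundary; only the later segments need extra padding.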

The following sections describe the format of the compressed bi-level layers and the compressed contone
layer. Section 26.5.1 on page 365 describes the format of the tag data structures.

8.1.2.3 Bi-level data compression

The (typically 1600 dpi) black bi-level layer is losslessly compressed using Silverbrook Modified Group 4
(SMG4) compression, which is a version of Group 4 Facsimile compression [18] without Huffman and
with simplified run length encodings. Typically compression ratios exceed 10:1. The encodings are listed in
Table 6 and Table 7.



Table 6. Bi-level group 4 facsimile style compression encodings

encoding | description
1000 | Pass Command: a0 <- b2, skip next two edges
1 | Vertical(0): a0 <- b1, color = !color
110 | Vertical(1): a0 <- b1 + 1, color = !color
010 | Vertical(-1): a0 <- b1 - 1, color = !color
110000 | Vertical(2): a0 <- b1 + 2, color = !color
010000 | Vertical(-2): a0 <- b1 - 2, color = !color
100000 | Vertical(3): a0 <- b1 + 3, color = !color
000000 | Vertical(-3): a0 <- b1 - 3, color = !color
<RL><RL>100 | Horizontal: a0 <- a0 + <RL> + <RL>


SMG4 has a pass through mode to cope with local negative compression. Pass through mode is activated
by a special run-length code. Pass through mode continues to either end of line or for a pre-programmed
number of bits, whichever is shorter. The special run-length code is always executed as a run-length code,
followed by pass through. The pass through escape code is a medium length run-length with a run of less
than or equal to 31.



Table 7. Run length (RL) encodings

encoding | description
RRRRR1 | Short Black Runlength (5 bits)
RRRRR1 | Short White Runlength (5 bits)
RRRRRRRRRR10 | Medium Black Runlength (10 bits)
RRRRRRRR10 | Medium White Runlength (8 bits)
RRRRRRRRRR10 | Medium Black Runlength with RRRRRRRRRR <= 31. Enter pass through
RRRRRRRR10 | Medium White Runlength with RRRRRRRR <= 31. Enter pass through
RRRRRRRRRRRRRRR00 | Long Black Runlength (15 bits)
RRRRRRRRRRRRRRR00 | Long White Runlength (15 bits)

Since the compression is a bitstream, the encodings are read right (least significant bit) to left (most significant
bit). The run lengths given as RRRR in Table 7 are read in the same way (least significant bit at the
right to most significant bit at the left).
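
The right-to-left reading order can be illustrated with a small bit reader. The following Python sketch is not the LBD implementation: it assumes that the bitstream is packed into bytes least significant bit first, in increasing address order, and it leaves the run-length field widths as parameters. Pass through mode is not handled. BitReader and read_run_length are hypothetical names.

```python
class BitReader:
    """Reads bits from a byte string, least significant bit of each byte first."""

    def __init__(self, data: bytes):
        self.data = data
        self.pos = 0              # absolute bit position in the stream

    def read(self, count: int) -> int:
        """Read `count` bits; the first bit read becomes bit 0 of the result."""
        value = 0
        for i in range(count):
            byte = self.data[self.pos >> 3]
            bit = (byte >> (self.pos & 7)) & 1
            value |= bit << i
            self.pos += 1
        return value


def read_run_length(reader: BitReader, short_bits: int, medium_bits: int,
                    long_bits: int = 15) -> int:
    """Decode one run length using the suffix codes of Table 7 (1, 10, 00)."""
    if reader.read(1) == 1:           # code ends in ...1: short run
        return reader.read(short_bits)
    if reader.read(1) == 1:           # code ends in ...10: medium run
        return reader.read(medium_bits)
    return reader.read(long_bits)     # code ends in ...00: long run
```

Reading the rightmost bit first means the decoder sees the length class (short, medium or long) before the run-length bits themselves, so it always knows how many further bits to consume.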

Each band of bi-level data is optionally self contained. The first line of each band therefore is based on a
'previous' blank line or the last line of the previous band.

8.1.2.3.1 Group 3 and 4 facsimile compression 

The Group 3 Facsimile compression algorithm [18] losslessly compresses bi-level data for transmission
over slow and noisy telephone lines. The bi-level data represents scanned black text and graphics on a
white background, and the algorithm is tuned for this class of images (it is explicitly not tuned, for example,
for halftoned bi-level images). The 1D Group 3 algorithm runlength-encodes each scanline and then
Huffman-encodes the resulting runlengths. Runlengths in the range 0 to 63 are coded with terminating
codes. Runlengths in the range 64 to 2623 are coded with make-up codes, each representing a multiple of
64, followed by a terminating code. Runlengths exceeding 2623 are coded with multiple make-up codes
followed by a terminating code. The Huffman tables are fixed, but are separately tuned for black and white
runs (except for make-up codes above 1728, which are common). When possible, the 2D Group 3 algorithm
encodes a scanline as a set of short edge deltas (0, ±1, ±2, ±3) with reference to the previous scanline.
The delta symbols are entropy-encoded (so that the zero delta symbol is only one bit long etc.). Edges
within a 2D-encoded line which can't be delta-encoded are runlength-encoded, and are identified by a prefix.
1D- and 2D-encoded lines are marked differently. 1D-encoded lines are generated at regular intervals,
whether actually required or not, to ensure that the decoder can recover from line noise with minimal
image degradation. 2D Group 3 achieves compression ratios of up to 6:1 [28].
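
The make-up and terminating code scheme can be illustrated without the Huffman tables. The Python sketch below only splits a run length into the values that would be coded; it assumes the largest make-up code is 2560, which is what the 2623 limit quoted above implies (2560 + 63). The function name is hypothetical.

```python
MAX_MAKEUP = 2560     # largest make-up value (a multiple of 64), assumed from 2623 = 2560 + 63


def split_run_length(run: int) -> list:
    """Split a run into make-up values followed by one terminating value (0-63)."""
    if run < 0:
        raise ValueError("run length must be non-negative")
    parts = []
    while run > MAX_MAKEUP + 63:          # runs above 2623 need extra make-up codes
        parts.append(MAX_MAKEUP)
        run -= MAX_MAKEUP
    if run >= 64:
        parts.append(run // 64 * 64)      # single make-up code
        run %= 64
    parts.append(run)                     # terminating code always closes the run
    return parts
```

A run of 3000 dots, for example, needs two make-up codes and a terminating code, which illustrates why long runs are awkward to encode at high resolutions.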

The Group 4 Facsimile algorithm [18] losslessly compresses bi-level data for transmission over error-free
communications lines (i.e. the lines are truly error-free, or error-correction is done at a lower protocol
level). The Group 4 algorithm is based on the 2D Group 3 algorithm, with the essential modification that
since transmission is assumed to be error-free, 1D-encoded lines are no longer generated at regular intervals
as an aid to error-recovery. Group 4 achieves compression ratios ranging from 20:1 to 60:1 for the
CCITT set of test images [28].

The design goals and performance of the Group 4 compression algorithm qualify it as a compression algorithm
for the bi-level layers. However, its Huffman tables are tuned to a lower scanning resolution (100-400 dpi),
and it encodes runlengths exceeding 2623 awkwardly.

8.1.2.4 Contone data compression

The contone layer (CMYK) is either a non-compressed bytestream or is compressed to an interleaved 
JPEG bytestream. The JPEG bytestream is complete and self-contained. It contains all data required for 
decompression, including quantization and Huffman tables. 

The contone data is optionally converted to YCrCb before being compressed (there is no specific advantage
in color-space converting if not compressing). Additionally, the CMY contone pixels are optionally
converted (on an individual basis) to RGB before color conversion using R=255-C, G=255-M, B=255-Y.
Optional bitwise inversion of the K plane may also be performed. Note that this CMY to RGB conversion
is not intended to be accurate for display purposes, but rather for the purposes of later converting to
YCrCb. The inverse transform will be applied before printing.

8.1.2.4.1 JPEG compression 

The JPEG compression algorithm [23] lossily compresses a contone image at a specified quality level. It
introduces imperceptible image degradation at compression ratios below 5:1, and negligible image degradation
at compression ratios below 10:1 [29].

JPEG typically first transforms the image into a color space which separates luminance and chrominance
into separate color channels. This allows the chrominance channels to be subsampled without appreciable
loss because of the human visual system's relatively greater sensitivity to luminance than chrominance.
After this first step, each color channel is compressed separately.

The image is divided into 8x8 pixel blocks. Each block is then transformed into the frequency domain via
a discrete cosine transform (DCT). This transformation has the effect of concentrating image energy in relatively
lower-frequency coefficients, which allows higher-frequency coefficients to be more crudely quantized.
This quantization is the principal source of compression in JPEG. Further compression is achieved
by ordering coefficients by frequency to maximize the likelihood of adjacent zero coefficients, and then
runlength-encoding runs of zeroes. Finally, the runlengths and non-zero frequency coefficients are entropy
coded. Decompression is the inverse process of compression.
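
The block pipeline described above (DCT, quantization, frequency ordering, zero run-length encoding) can be sketched in a few lines of Python. This is a simplified illustration rather than a JPEG encoder: it uses a single uniform quantization step instead of a quantization table, omits the level shift and the final entropy coding, and all function names are hypothetical.

```python
import math

N = 8

# Zigzag scan order: walk the anti-diagonals, alternating direction.
ZIGZAG = sorted(
    ((r, c) for r in range(N) for c in range(N)),
    key=lambda rc: (rc[0] + rc[1], rc[0] if (rc[0] + rc[1]) % 2 else rc[1]),
)


def dct_8x8(block):
    """Two-dimensional DCT-II of an 8x8 block (orthonormal scaling)."""
    def alpha(k):
        return math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)

    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            total = 0.0
            for x in range(N):
                for y in range(N):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[u][v] = alpha(u) * alpha(v) * total
    return out


def encode_block(block, step=16):
    """Quantize, zigzag-order and run-length encode one 8x8 block.

    Returns (zero_run, value) pairs; a trailing (0, 0) marks end of block.
    """
    coeffs = dct_8x8(block)
    quantized = [round(coeffs[r][c] / step) for r, c in ZIGZAG]
    pairs, zeros = [], 0
    for q in quantized:
        if q == 0:
            zeros += 1            # extend the current run of zero coefficients
        else:
            pairs.append((zeros, q))
            zeros = 0
    pairs.append((0, 0))          # trailing zeros collapse into the end marker
    return pairs
```

A flat block shows the energy concentration clearly: all of its information ends up in the single lowest-frequency coefficient, and the remaining 63 coefficients collapse into the end-of-block marker.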

8.1.2.4.2 Non-compressed format

If the contone data is non-compressed, it must be in a block-based format bytestream with the same pixel
order as would be produced by a JPEG decoder. The bytestream therefore consists of a series of 8x8 blocks
of the original image, starting with the top left 8x8 block, and working horizontally across the page (as it
will be printed) until the top rightmost 8x8 block, then the next row of 8x8 blocks (left to right) and so on
until the lower row of 8x8 blocks (left to right). Each 8x8 block consists of 64 8-bit pixels for color plane
0 (representing 8 rows of 8 pixels in the order top left to bottom right) followed by 64 8-bit pixels for color
plane 1 and so on for up to a maximum of 4 color planes.

If the original image is not a multiple of 8 pixels in X or Y, padding must be present (the extra pixel data 
will be ignored by the setting of margins). 
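
The block ordering of the non-compressed format can be illustrated as follows. This Python sketch converts planar image data into the 8x8 block order described above; the function name, the planar input layout and the pad value of 0 are assumptions made for the example (the text only requires that padding be present).

```python
def to_block_stream(planes, width, height, pad_value=0):
    """Serialize planar 8-bit image data into 8x8 block order.

    `planes` is a list of up to 4 planes; each plane is a list of `height`
    rows, and each row is a list of `width` values in 0..255.
    """
    blocks_x = (width + 7) // 8
    blocks_y = (height + 7) // 8
    out = bytearray()
    for by in range(blocks_y):                 # rows of blocks, top to bottom
        for bx in range(blocks_x):             # blocks in a row, left to right
            for plane in planes:               # plane 0 first, then plane 1, ...
                for y in range(by * 8, by * 8 + 8):
                    for x in range(bx * 8, bx * 8 + 8):
                        inside = y < height and x < width
                        out.append(plane[y][x] if inside else pad_value)
    return bytes(out)
```

Each block contributes 64 bytes per color plane, so a CMYK image occupies 256 bytes per 8x8 block regardless of how much of the block is padding.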

8.1.2.4.3 Compressed format

If the contone data is compressed, the first memory band contains JPEG headers (including tables) plus
MCUs (minimum coded units). The ratio of space between the various color planes in the JPEG stream is
1:1:1:1. No subsampling is permitted. Banding can be completely arbitrary, i.e. there can be multiple JPEG
images per band or 1 JPEG image divided over multiple bands. The break between bands is only memory
alignment based.

8.1.2.4.4 Conversion of RGB to YCrCb (in RIP) 

YCrCb is defined as per CCIR 601-1 [20] except that Y, Cr and Cb are normalized to occupy all 256 levels
of an 8-bit binary encoding and take account of the actual hardware implementation of the inverse transform
within SoPEC.

The exact color conversion computation is as follows:

• Y* = (9805/32768)R + (19235/32768)G + (3728/32768)B

• Cr* = (16375/32768)R - (13716/32768)G - (2659/32768)B + 128

• Cb* = -(5529/32768)R - (10846/32768)G + (16375/32768)B + 128
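
The computation above, followed by rounding to the nearest integer, can be written in integer arithmetic. The Python sketch below is illustrative only: the coefficients are those listed above, while the function names and the use of a right shift by 15 bits with a rounding constant are choices made for the example.

```python
def rgb_to_ycrcb(r: int, g: int, b: int):
    """Convert 8-bit RGB to Y, Cr, Cb using the fixed-point coefficients above."""
    half = 1 << 14                       # 0.5 in units of 1/32768, for rounding
    offset = 128 << 15                   # +128 in units of 1/32768
    y = (9805 * r + 19235 * g + 3728 * b + half) >> 15
    cr = (16375 * r - 13716 * g - 2659 * b + offset + half) >> 15
    cb = (-5529 * r - 10846 * g + 16375 * b + offset + half) >> 15
    return y, cr, cb


def cmy_to_rgb(c: int, m: int, y: int):
    """Optional pre-conversion step from the text: R=255-C, G=255-M, B=255-Y."""
    return 255 - c, 255 - m, 255 - y
```

Because the Y coefficients sum to 32768 and the negative Cr and Cb coefficients each sum to 16375, the results stay within 8 bits for all 8-bit inputs and no clamping is required.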

Y, Cr and Cb are obtained by rounding to the nearest integer. There is no need for saturation since ranges
of Y*, Cr* and Cb* after rounding are [0-255], [1-255] and [1-255] respectively. Note that full accuracy is
possible with 24 bits. See [14] for more information.

9 Overview

The Small Office Home Office Print Engine Controller (SoPEC) is a page rendering engine ASIC that
takes compressed page images as input, and produces decompressed page images at up to 6 channels of
bi-level dot data as output. The bi-level dot data is generated for the Memjet bi-lithic printhead. The dot
generation process takes account of printhead construction and dead nozzles, and allows for fixative generation.

A single SoPEC can control 2 bi-lithic printheads and up to 6 color channels at 10,000 lines/sec (see footnote 1), equating
to 30 pages per minute. A single SoPEC can perform full-bleed printing of A3, A4 and Letter pages. The 6
channels of colored ink are the expected maximum in a consumer SOHO, or office Bi-lithic printing environment:

• CMY, for regular color printing, 

• K, for black text, line graphics and gray-scale printing. 

• IR (infrared), for Netpage-enabled [5] applications. 

• F (fixative), to enable printing at high speed. Because the bi-lithic printer is capable of printing so fast,
a fixative may be required to enable the ink to dry before the page touches the page already printed.
Otherwise the pages may bleed on each other. In low speed printing environments the fixative may not
be required.

SoPEC is color space agnostic. Although it can accept contone data as CMYX or RGBX, where X is an
optional 4th channel, it also can accept contone data in any print color space. Additionally, SoPEC provides
a mechanism for arbitrary mapping of input channels to output channels, including combining dots
for ink optimization, generation of channels based on any number of other channels etc. However, inputs
are typically CMYK for contone input, K for the bi-level input, and the optional Netpage tag dots are typically
rendered to an infra-red layer. A fixative channel is typically generated for fast printing applications.

SoPEC is resolution agnostic. It merely provides a mapping between input resolutions and output resolutions
by means of scale factors. The expected output resolution is 1600 dpi, but SoPEC actually has no
knowledge of the physical resolution of the Bi-lithic printhead.

SoPEC is page-length agnostic. Successive pages are typically split into bands and downloaded into the
page store as each band of information is consumed and becomes free.

SoPEC provides an interface for synchronization with other SoPECs. This allows simple multi-SoPEC
solutions for simultaneous A3/A4/Letter duplex printing. However, SoPEC is also capable of printing only
a portion of a page image. Combining synchronization functionality with partial page rendering allows
multiple SoPECs to be readily combined for alternative printing requirements including simultaneous
duplex printing and wide format printing.

Table 8 lists some of the features and corresponding benefits of SoPEC. 



Table 8. Features and Benefits of SoPEC

feature | benefits
Optimised print architecture in hardware | 30ppm full page photographic quality color printing from a desktop PC
0.13micron CMOS (>3 million transistors) | High speed. Low cost. High functionality
900 Million dots per second | Extremely fast page generation
10,000 lines per second at 1600 dpi | 0.5 A4/Letter pages per SoPEC chip per second
1 chip drives up to 133,920 nozzles | Low cost page-width printers
1 chip drives up to 6 color planes | 99% of SoHo printers can use 1 SoPEC device
Integrated DRAM | No external memory required, leading to low cost systems
Power saving sleep mode | SoPEC can enter a power saving sleep mode to reduce power dissipation between print jobs
JPEG expansion | Low bandwidth from PC. Low memory requirements in printer
Lossless bitplane expansion | High resolution text and line art with low bandwidth from PC (e.g. over USB)
Netpage tag expansion | Generates interactive paper
Stochastic dispersed dot dither | Optically smooth image quality. No moire effects
Hardware compositor for 6 image planes | Pages composited in real-time
Dead nozzle compensation | Extends printhead life and yield. Reduces printhead cost
Color space agnostic | Compatible with all inksets and image sources including RGB, CMYK, spot, CIE L*a*b*, hexachrome, YCrCbK, sRGB and other
Color space conversion | Higher quality / lower bandwidth
Computer interface | USB1.1 interface to Host and ISI interface to ISI-Bridge chip, thereby allowing connection to IEEE 1394, Bluetooth etc.
Cascadable in resolution | Printers of any resolution
Cascadable in color depth | Special color sets e.g. hexachrome can be used
Cascadable in image size | Printers of any width up to 16 inches
Cascadable in pages | Printers can print both sides simultaneously
Cascadable in speed | Higher speeds are possible by having each SoPEC print one vertical strip of the page.
Fixative channel data generation | Extremely fast ink drying without wastage
Built-in security | Revenue models are protected
Undercolor removal on dot-by-dot basis | Reduced ink usage
Does not require fonts for high speed operation | No font substitution or missing fonts
Flexible printhead configuration | Many configurations of printheads are supported by one chip type
Drives Bi-lithic printheads directly | No print driver chips required, results in lower cost
Determines dot accurate ink usage | Removes need for physical ink monitoring system in ink cartridges

1. 10,000 lines per second equates to 30 A4/Letter pages per minute at 1600 dpi.


9.1 Printing Rates

The required printing rate for SoPEC is 30 sheets per minute with an inter-sheet spacing of 4 cm. To
achieve a 30 sheets per minute print rate, this requires:

2 sec / (300 mm x 63 (dot/mm)) = 105.8 µseconds per line, with no inter-sheet gap,
2 sec / (340 mm x 63 (dot/mm)) = 93.3 µseconds per line, with a 4 cm inter-sheet gap.

A print line for an A4 page consists of 13824 nozzles across the page [2]. At a system clock rate of 160
MHz 13824 dots of data can be generated in 86.4 µseconds. Therefore data can be generated fast enough
to meet the printing speed requirement. It is necessary to deliver this print data to the printheads.

Printheads can be made up of 5:5, 6:4, 7:3 and 8:2 inch printhead combinations [2]. Print data is
transferred to both printheads in a pair simultaneously. This means the longest time to print a line is determined
by the time to transfer print data to the longest print segment. There are 9744 nozzles across a 7 inch
printhead. The print data is transferred to the printhead at a rate of 106 MHz (2/3 of the system clock rate) per
color plane. This means that it will take 91.9 µs to transfer a single line for a 7:3 printhead configuration.
So we can meet the requirement of 30 sheets per minute printing with a 4 cm gap with a 7:3 printhead
combination. There are 11160 nozzles across an 8 inch printhead. To transfer the data to the printhead at 106 MHz
will take 105.3 µs. So an 8:2 printhead combination printing with an inter-sheet gap will print slower than
30 sheets per minute.
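
The timing argument above can be checked with a few lines of arithmetic. The following is a minimal sketch using only the figures quoted in this section; the function and constant names are illustrative and are not part of the SoPEC design.

```python
# Line-time budget versus printhead transfer time, using the Section 9.1 figures.
# All names here are illustrative, not SoPEC register or signal names.

DOTS_PER_MM = 63             # dot pitch quoted in the text
SHEET_PERIOD_S = 2.0         # 30 sheets per minute
SYSTEM_CLOCK_HZ = 160e6      # system clock
PRINTHEAD_CLOCK_HZ = 106e6   # printhead data rate per color plane, as quoted


def line_time_us(paper_travel_mm):
    """Time available per print line when paper_travel_mm passes in one sheet period."""
    lines = paper_travel_mm * DOTS_PER_MM
    return SHEET_PERIOD_S / lines * 1e6


def transfer_time_us(nozzles, clock_hz=PRINTHEAD_CLOCK_HZ):
    """Time to clock one line of dots for a segment with the given nozzle count."""
    return nozzles / clock_hz * 1e6


def meets_rate(nozzles, paper_travel_mm):
    """True if the longest segment can be loaded within the line-time budget."""
    return transfer_time_us(nozzles) <= line_time_us(paper_travel_mm)


if __name__ == "__main__":
    for label, nozzles in (("7 inch", 9744), ("8 inch", 11160)):
        print(label, round(transfer_time_us(nozzles), 1), "us,",
              "meets 30 spm with 4 cm gap:", meets_rate(nozzles, 340))
```

With a 340 mm travel per sheet the 7 inch segment fits inside the line time and the 8 inch segment does not, which matches the conclusion drawn in the text.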



9.2 SoPEC Basic Architecture



From the highest point of view the SoPEC device consists of 3 distinct subsystems:

• CPU Subsystem

• DRAM Subsystem

• Print Engine Pipeline (PEP) Subsystem

See Figure 13 for a block level diagram of SoPEC.



9.2.1 CPU Subsystem 



The CPU subsystem controls and configures all aspects of the other subsystems. It provides general
support for interfacing and synchronising the external printer with the internal print engine. It also controls the
low speed communication to the QA chips. The CPU subsystem contains various peripherals to aid the
CPU, such as GPIO (includes motor control), interrupt controller, LSS Master and general timers. The
Serial Communications Block (SCB) on the CPU subsystem provides a full speed USB 1.1 interface to the
Host as well as an Inter SoPEC Interface (ISI) to other SoPEC devices.

9.2.2 DRAM Subsystem 

The DRAM subsystem accepts requests from the CPU, Serial Communications Block (SCB) and blocks
within the PEP subsystem. The DRAM subsystem (in particular the DIU) arbitrates the various requests
and determines which request should win access to the DRAM. The DIU arbitrates based on configured
parameters to allow sufficient access to DRAM for all requestors. The DIU also hides the implementation
specifics of the DRAM such as page size, number of banks, refresh rates etc.

9.2.3 Print Engine Pipeline (PEP) subsystem 

The Print Engine Pipeline (PEP) subsystem accepts compressed pages from DRAM and renders them to
bi-level dots for a given print line destined for a printhead interface that communicates directly with up to
2 segments of a bi-lithic printhead.

The first stage of the page expansion pipeline is the CDU, LBD and TE. The CDU expands the
JPEG-compressed contone (typically CMYK) layer, the LBD expands the compressed bi-level layer (typically K)
and the TE encodes Netpage tags for later rendering (typically in IR or K ink). The output from the first
stage is a set of buffers: the CFU, SFU, and TFU. The CFU and SFU buffers are implemented in DRAM.

The second stage is the HCU, which dithers the contone layer, and composites position tags and the
bi-level spot0 layer over the resulting bi-level dithered layer. A number of options exist for the way in which
compositing occurs. Up to 6 channels of bi-level data are produced from this stage. Note that not all 6
channels may be present on the printhead. For example, the printhead may be CMY only, with K pushed
into the CMY channels and IR ignored. Alternatively, the position tags may be printed in K if IR ink is not
available (or for testing purposes).

The third stage (DNC) compensates for dead nozzles in the printhead by color redundancy and error
diffusing dead nozzle data into surrounding dots.

The resultant bi-level 6 channel dot-data (typically CMYK-IRF) is buffered and written out to a set of line
buffers stored in DRAM via the DWU.

Finally, the dot-data is loaded back from DRAM, and passed to the printhead interface via a dot FIFO. The
dot FIFO accepts data from the LLU at the system clock rate (pclk), while the PHI removes data from the
FIFO and sends it to the printhead at a rate of 2/3 times the system clock rate (see Section 9.1).

Figure 13. SoPEC System Top Level partition

9.3 SoPEC Block Description

Looking at Figure 13, the various units are described here in summary form: 



Table 9. Units within SoPEC

Subsystem       Unit  Unit Name                     Description
--------------  ----  ----------------------------  ---------------------------------------------------------
DRAM            DIU   DRAM interface unit           Provides the interface for DRAM read and write access
                                                    for the various SoPEC units, CPU and the SCB block.
                                                    The DIU provides arbitration between competing units
                                                    and controls DRAM access.
                DRAM  Embedded DRAM
CPU             CPU   Central Processing Unit       CPU for system configuration and control
                MMU   Memory Management Unit        Limits access to certain memory address areas in CPU
                                                    user mode
                RDU   Real-time Debug Unit          Facilitates the observation of the contents of most of
                                                    the CPU addressable registers in SoPEC in addition to
                                                    some pseudo-registers in realtime.
                TIM   General Timer                 Contains watchdog and general system timers
                LSS   Low Speed Serial Interfaces   Low level controller for interfacing with the QA chips
                GPIO  General Purpose IOs           General IO controller, with built-in Motor control unit,
                                                    LED pulse units and de-glitch circuitry
                ROM   Boot ROM                      16 KBytes of System Boot ROM code
                ICU   Interrupt Controller Unit     General Purpose interrupt controller with configurable
                                                    priority, and masking.
                CPR   Clock, Power and Reset block  Central Unit for controlling and generating the system
                                                    clocks and resets and powerdown mechanisms
                PSS   Power Save Storage            Storage retained while system is powered down
                USB   Universal Serial Bus Device   USB device controller for interfacing with the Host USB.
                ISI   Inter-SoPEC Interface         ISI controller for data and control communication with
                                                    other SoPECs in a multi-SoPEC system
                SCB   Serial Communication Block    Contains both the USB and ISI blocks.
Print Engine    PCU   PEP controller                Provides external CPU with the means to read and write
Pipeline (PEP)                                      PEP Unit registers, and read and write DRAM in single
                                                    32-bit chunks.
                CDU   Contone decoder unit          Expands JPEG compressed contone layer and writes
                                                    decompressed contone to DRAM
                CFU   Contone FIFO Unit             Provides line buffering between CDU and HCU
                LBD   Lossless Bi-level Decoder     Expands compressed bi-level layer.
                SFU   Spot FIFO Unit                Provides line buffering between LBD and HCU
                TE    Tag encoder                   Encodes tag data into line of tag dots.
                TFU   Tag FIFO Unit                 Provides tag data storage between TE and HCU
                HCU   Halftoner compositor unit     Dithers contone layer and composites the bi-level spot 0
                                                    and position tag dots.
                DNC   Dead Nozzle Compensator       Compensates for dead nozzles by color redundancy and
                                                    error diffusing dead nozzle data into surrounding dots.
                DWU   Dotline Writer Unit           Writes out the 6 channels of dot data for a given printline
                                                    to the line store DRAM
                LLU   Line Loader Unit              Reads the expanded page image from line store, formatting
                                                    the data appropriately for the bi-lithic printhead.
                PHI   PrintHead Interface           Is responsible for sending dot data to the bi-lithic
                                                    printheads and for providing line synchronization between
                                                    multiple SoPECs. Also provides test interface to
                                                    printhead such as temperature monitoring and Dead Nozzle
                                                    Identification.



9.4 Addressing scheme in SoPEC 

SoPEC must address 

• 20 Mbit DRAM. 

• PCU addressed registers in PEP. 

• CPU-subsystem addressed registers. 

SoPEC has a unified address space with the CPU capable of addressing all CPU-subsystem and PCU-bus 
accessible registers (in PEP) and all locations in DRAM. The CPU generates byte-aligned addresses for 
the whole of SoPEC. 

22 bits are sufficient to byte address the whole SoPEC address space. 

9.4.1 DRAM addressing scheme 

The embedded DRAM is composed of 256-bit words. However the CPU-subsystem may need to write 
individual bytes of DRAM. Therefore it was decided to make the DIU byte addressable. 22 bits are 
required to byte address 20 Mbits of DRAM. 

Most blocks read or write 256-bit words of DRAM. Therefore only the top 17 bits i.e. bits 21 to 5 are 
required to address 256-bit word aligned locations. 

The exceptions are 

• CDU which can write 64-bits so only the top 19 address bits i.e. bits 21-3 are required.

• The CPU-subsystem always generates a 22-bit byte-aligned DIU address but it will send flags to the
DIU indicating whether it is an 8, 16 or 32-bit write.

All DIU accesses must be within the same 256-bit aligned DRAM word.

9.4.2 PEP Unit DRAM addressing

PEP Unit configuration registers which specify DRAM locations should specify 256-bit aligned DRAM
addresses i.e. using address bits 21:5. Legacy blocks from PEC1 e.g. the LBD and TE may need to specify
64-bit aligned DRAM addresses if these reused blocks' DRAM addressing is difficult to modify. These
64-bit aligned addresses require address bits 21:3. However, these 64-bit aligned addresses should be
programmed to start at a 256-bit DRAM word boundary.

Unlike PEC1, there are no constraints in SoPEC on data organization in DRAM except that all data
structures must start on a 256-bit DRAM boundary. If data stored is not a multiple of 256 bits then the last word
should be padded.
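
The addressing rules in Sections 9.4.1 and 9.4.2 can be illustrated with a short sketch. It assumes that "20 Mbit" means 20 x 2^20 bits, which is consistent with the DRAM region ending at 0x0028_0000 in the memory map; the helper names are hypothetical and do not correspond to any SoPEC signal or register.

```python
# Illustrative helpers for the DRAM addressing rules; names are hypothetical.

DRAM_BYTES = 20 * 2**20 // 8   # assumes 20 Mbit = 20 * 2**20 bits
WORD_BYTES = 256 // 8          # one 256-bit DRAM word


def word_address(byte_addr):
    """256-bit word index, i.e. byte address bits 21:5."""
    return byte_addr >> 5


def within_one_word(byte_addr, nbytes):
    """True if an access of nbytes starting at byte_addr stays in one 256-bit word."""
    return word_address(byte_addr) == word_address(byte_addr + nbytes - 1)


def padded_size(nbytes):
    """Size in bytes after padding the last word up to a 256-bit boundary."""
    return -(-nbytes // WORD_BYTES) * WORD_BYTES


if __name__ == "__main__":
    print(hex(DRAM_BYTES), (DRAM_BYTES - 1).bit_length())   # byte-address width
    print(word_address(DRAM_BYTES - 1).bit_length())        # word-address width
    print(within_one_word(0x1C, 4), within_one_word(0x1E, 4))
    print(padded_size(33))
```

Under that assumption the highest byte address needs 22 bits and the highest word address needs 17 bits, in line with the bit ranges 21:0 and 21:5 given above.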



9.4.3 CPU-bus addressed registers 



The CPU-bus supports 32-bit word aligned read and write accesses with variable access timings. See
Section 11.4 for more details of the access protocol used on this bus. The CPU-bus does not currently support
byte reads and writes but this can be added at a later date if required by imported IP.



9.4.4 PCU addressed registers in PEP 



The PCU only supports 32-bit register reads and writes for the PEP blocks. As the PEP blocks only occupy
a subsection of the overall address map and the PCU is explicitly selected by the MMU when a PEP block
is being accessed the PCU does not need to perform a decode of the higher-order address bits. See
Table 11 for the PEP subsystem address map.

9.5 SoPEC Memory Map



9.5.1 Main memory map



The system wide memory map is shown in Figure 14 below. The memory map is discussed in detail in
Section 11 Central Processing Unit (CPU).



Address range                 Region
0x002A_C000 to 0xFFFF_FFFF    (none)
0x002A_0000 to 0x002A_BFFF    PCU Mapped Registers
0x0029_0000 to 0x0029_FFFF    Peripheral Registers
0x0028_0000 to 0x0028_FFFF    ROM
0x0000_0000 to 0x0027_FFFF    DRAM (DRAM Regions)

Annotations on the diagram, from the top of the address space to the bottom:

• Accesses in this area are not allowed and result in a bus error exception.

• Accesses in this area are via the CPU bus and are controlled by permissions set in each peripheral.

• Accesses in this area are via the DIU bus and are controlled by permissions set in the MMU.

Figure 14. Proposed SoPEC CPU memory map (not to scale)

9.5.2 CPU-bus peripherals address map 

The address mapping for the peripherals attached to the CPU-bus is shown in Table 10 below. The MMU
performs the decode of cpu_adr[21:12] to generate the relevant cpu_block_select signal for each block.
The addressed blocks decode however many of the lower order bits of cpu_adr[11:2] are required to
address all the registers within the block.



Table 10. CPU-bus peripherals address map

Block_base   Address
----------   --------------------------
MMU_base     0x0029_0000
TIM_base     0x0029_1000
LSS_base     0x0029_2000
GPIO_base    0x0029_3000
SCB_base     0x0029_4000
ICU_base     0x0029_5000
CPR_base     0x0029_6000
ROM_base     0x0029_7000
DIU_base     0x0029_8000
PSS_base     0x0029_9000
Reserved     0x0029_A000 to 0x0029_FFFF
PCU_base     0x002A_0000 to 0x002A_BFFF
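
The block-select decode described in Section 9.5.2 can be modelled in software as follows. This is only a sketch of the address arithmetic using the base addresses of Table 10; the function name, the return format and the treatment of DRAM and reserved addresses are assumptions made for illustration, not a description of the MMU hardware.

```python
# Software model of the cpu_adr[21:12] block-select decode (illustrative only).

# Upper address bits cpu_adr[21:12] for each CPU-bus block, from Table 10.
CPU_BUS_BLOCKS = {
    0x290: "MMU", 0x291: "TIM", 0x292: "LSS", 0x293: "GPIO", 0x294: "SCB",
    0x295: "ICU", 0x296: "CPR", 0x297: "ROM", 0x298: "DIU", 0x299: "PSS",
}
PCU_FIRST, PCU_LAST = 0x2A0, 0x2AB   # 0x002A_0000 to 0x002A_BFFF


def block_select(cpu_adr):
    """Return (block, word_index) for a 22-bit byte address.

    Returns None for addresses that are not CPU-bus registers
    (DRAM, the ROM contents region, reserved and unmapped space).
    """
    if cpu_adr < 0 or cpu_adr >> 22:
        raise ValueError("address does not fit in 22 bits")
    upper = cpu_adr >> 12                 # cpu_adr[21:12]
    word_index = (cpu_adr >> 2) & 0x3FF   # cpu_adr[11:2]
    if upper in CPU_BUS_BLOCKS:
        return CPU_BUS_BLOCKS[upper], word_index
    if PCU_FIRST <= upper <= PCU_LAST:
        return "PCU", word_index
    return None


if __name__ == "__main__":
    print(block_select(0x00291004))   # a register in the TIM block
    print(block_select(0x0029A000))   # reserved range
```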



9.5.3 PCU Mapped Registers (PEP blocks) address map

The PEP blocks are addressed via the PCU. From Figure 14, the PCU mapped registers are in the range
0x002A_0000 to 0x002A_BFFF. From Table 11 it can be seen that there are 12 sub-blocks within the PCU
address space. Therefore, only four bits are necessary to address each of the sub-blocks within the PEP
part of SoPEC. A further 12 bits may be used to address any configurable register within a PEP block. This
gives scope for 1024 configurable registers per sub-block (the PCU mapped registers are all 32-bit
addressed registers so the upper 10 bits are required to individually address them). This address will come
either from the CPU or from a command stored in DRAM. The bus is assembled as follows:

• address[15:12] = sub-block address,

• address[n:2] = register address within sub-block, only the number of bits required to decode the
registers within each sub-block are used,

• address[1:0] = byte address, unused as PCU mapped registers are all 32-bit addressed registers.

So for the case of the HCU, its addresses range from 0x7000 to 0x7FFF within the PEP subsystem or from
0x002A_7000 to 0x002A_7FFF in the overall system.
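
The bus assembly above can be expressed as a small decoding routine. This is a sketch only: the sub-block order follows the base addresses listed in Table 11, and the function name and tuple layout are illustrative choices rather than anything defined by the PCU.

```python
# Split a PCU-mapped address into its fields (illustrative sketch).

PEP_BASE = 0x002A_0000
# Sub-blocks in address order, as listed in Table 11.
PEP_SUB_BLOCKS = ["PCU", "CDU", "CFU", "LBD", "SFU", "TE",
                  "TFU", "HCU", "DNC", "DWU", "LLU", "PHI"]


def split_pep_address(address):
    """Return (sub_block, register_index, byte_offset) for a PCU-mapped address."""
    offset = address - PEP_BASE
    if not 0 <= offset <= 0xBFFF:
        raise ValueError("address is outside the PCU mapped register range")
    sub_block = (offset >> 12) & 0xF       # address[15:12]
    register_index = (offset >> 2) & 0x3FF  # address[11:2], up to 1024 registers
    byte_offset = offset & 0x3              # address[1:0], unused by the PCU
    return PEP_SUB_BLOCKS[sub_block], register_index, byte_offset


if __name__ == "__main__":
    print(split_pep_address(0x002A_7000))   # first HCU register
    print(split_pep_address(0x002A_7FFC))   # last HCU register slot
```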

Table 11. PEP blocks address map

Block_base   Address
----------   --------------------------
PCU_base     0x002A_0000
CDU_base     0x002A_1000
CFU_base     0x002A_2000
LBD_base     0x002A_3000
SFU_base     0x002A_4000
TE_base      0x002A_5000
TFU_base     0x002A_6000
HCU_base     0x002A_7000
DNC_base     0x002A_8000
DWU_base     0x002A_9000
LLU_base     0x002A_A000
PHI_base     0x002A_B000 to 0x002A_BFFF

9.6 Buffer Management in SoPEC

As outlined in Section 9.1, SoPEC has a requirement to print 1 side every 2 seconds i.e. 30 sides per
minute.



9.6.1 Page buffering 

Approximately 2 Mbytes of DRAM are reserved for compressed page buffering in SoPEC. If a page is
compressed to fit within 2 Mbyte then a complete page can be transferred to DRAM before printing.
However, the time to transfer 2 Mbyte using USB 1.1 is approximately 2 seconds. The worst case cycle time to
print a page then approaches 4 seconds. This reduces the worst-case print speed to 15 pages per minute.
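
The worst-case figure follows from adding the download time to the print time. The sketch below contrasts that serial case with the case where download overlaps printing, which is the motivation for band buffering; both functions are simple illustrative models and the names are not taken from the design.

```python
# Page throughput for serial versus overlapped download and print (illustrative).


def serial_pages_per_minute(download_s, print_s):
    """Each page is fully downloaded, then printed, before the next download starts."""
    return 60.0 / (download_s + print_s)


def overlapped_pages_per_minute(download_s, print_s):
    """The next page (or band) is downloaded while the current one prints."""
    return 60.0 / max(download_s, print_s)


if __name__ == "__main__":
    print(serial_pages_per_minute(2.0, 2.0))       # worst case quoted in the text
    print(overlapped_pages_per_minute(2.0, 2.0))   # required sustained rate
```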

9.6.2 Band buffering 

The SoPEC page-expansion blocks support the notion of page banding. The page can be divided into
bands and another band can be sent down to SoPEC while we are printing the current band.
Therefore we can start printing once at least one band has been downloaded.

The band-size granularity should be carefully chosen to allow efficient use of the USB bandwidth and
DRAM buffer space. It should be small enough to allow seamless 30 sides per minute printing but not so
small as to introduce excessive CPU overhead in orchestrating the data transfer and parsing the band
headers. Band-finish interrupts have been provided to notify the CPU of free buffer space. It is likely that the
Host PC will supervise the band transfer and buffer management instead of the SoPEC CPU.

If SoPEC starts printing before the complete page has been transferred to memory there is a risk of a buffer
underrun occurring if subsequent bands are not transferred to SoPEC in time e.g. due to insufficient USB
bandwidth caused by another USB peripheral consuming USB bandwidth. A buffer underrun occurs if a
line synchronisation pulse is received before a line of data has been transferred to the printhead and causes
the print job to fail at that line. If there is no risk of buffer underrun then printing can safely start once at
least one band has been downloaded.

If there is a risk of a buffer underrun occurring due to an interruption of compressed page data transfer,
then the safest approach is to only start printing once we have loaded up the data for a complete page. This
means that a worst case latency in the region of 2 seconds (with USB 1.1) will be incurred before printing
the first page. Subsequent pages will take 2 seconds to print giving us the required sustained printing rate
of 30 sides per minute.

A Storage SoPEC (Section 7.2.5) could be added to the system to provide guaranteed bandwidth data
delivery. The print system could also be constructed using an ISI-Bridge chip (Section 7.2.6) to provide
guaranteed data delivery.

The most efficient page banding strategy is likely to be determined on a per page/print job basis and so
SoPEC will support the use of bands of any size.
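
The underrun risk discussed above can be explored with a coarse timing model. The sketch below works at band granularity rather than per line, assumes a constant download rate, and ignores DRAM buffer space limits; all of these are simplifying assumptions, and the function names are hypothetical.

```python
# Band-level buffer underrun check (coarse illustrative model, not the SoPEC mechanism).
from itertools import accumulate


def _arrival_and_offsets(band_bytes, band_print_s, rate_bytes_per_s):
    # Time at which each band has fully arrived, downloading back to back from t = 0.
    arrival = [total / rate_bytes_per_s for total in accumulate(band_bytes)]
    # Offset from the start of printing at which each band is first needed.
    offsets = list(accumulate([0.0] + list(band_print_s[:-1])))
    return arrival, offsets


def first_underrun(band_bytes, band_print_s, rate_bytes_per_s, start_delay_s):
    """Index of the first band that is needed before it has arrived, or None."""
    arrival, offsets = _arrival_and_offsets(band_bytes, band_print_s, rate_bytes_per_s)
    for index, (arrived, offset) in enumerate(zip(arrival, offsets)):
        if arrived > start_delay_s + offset:
            return index
    return None


def min_safe_start_delay(band_bytes, band_print_s, rate_bytes_per_s):
    """Smallest delay before printing starts that avoids an underrun in this model."""
    arrival, offsets = _arrival_and_offsets(band_bytes, band_print_s, rate_bytes_per_s)
    return max(arrived - offset for arrived, offset in zip(arrival, offsets))


if __name__ == "__main__":
    bands = [500_000] * 4   # hypothetical band sizes in bytes
    times = [0.5] * 4       # hypothetical print time per band in seconds
    print(min_safe_start_delay(bands, times, 1_000_000))
    print(first_underrun(bands, times, 800_000, 0.5))
```

When the download rate drops, the safe start delay grows towards the full page download time, which is the situation where waiting for the complete page becomes the safest approach.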
10 SoPEC Use Cases

10.1 Introduction

This chapter is intended to give an overview of a representative set of scenarios or use cases which SoPEC
can perform. SoPEC is by no means restricted to the particular use cases described here.
In this chapter we discuss SoPEC use cases under four headings:

1) Normal operation use cases.

2) Security use cases.

3) Miscellaneous use cases.

4) Failure mode use cases.

Use cases for both single and multi-SoPEC systems are outlined.
Some tasks may be composed of a number of sub-tasks.

The realtime requirements for SoPEC software tasks are discussed in "Central Processing Unit (CPU)"
under Section 11.3 Realtime requirements.

10.2 Normal operation in a single SoPEC System with USB Host connection

SoPEC operation is broken up into a number of sections which are outlined below. Buffer management in
a SoPEC system is normally performed by the Host.

10.2.1 Powerup

Powerup describes SoPEC initialisation following an external reset or the watchdog timer system reset.
A typical powerup sequence is:

1) Execute reset sequence for complete SoPEC.

2) CPU boot from ROM.

3) Basic configuration of CPU peripherals, SCB and DIU. DRAM initialisation. USB Wakeup.

4) Download and authentication of program (see Section 10.5.2).

5) Store reusable authentication results in Power-Safe Storage (PSS).

6) Execution of program from DRAM.

7) Retrieve operating parameters from PRINTER_QA and authenticate operating parameters.

8) Download and authenticate any further datasets.

10.2.2 USB wakeup 

The CPU can put different sections of SoPEC into sleep mode by writing to registers in the CPR block
(chapter 16). Normally the CPU sub-system and the DRAM will be put in sleep mode but the SCB and
power-safe storage (PSS) will still be enabled.

Wakeup describes SoPEC recovery from sleep mode with the SCB and power-safe storage (PSS) still
enabled. In a single SoPEC system, wakeup can be initiated following a USB reset from the SCB.
A typical USB wakeup sequence is:

1) Execute reset sequence for sections of SoPEC in sleep mode.

2) CPU boot from ROM, if CPU-subsystem was in sleep mode.

3) Basic configuration of CPU peripherals and DIU, and DRAM initialisation, if required.






4) Download and authentication of program using results in Power-Safe Storage (PSS) (see Section

5) Execution of program from DRAM.

6) Retrieve operating parameters from PRINTER_QA and authenticate operating parameters.

7) Download and authenticate using results in PSS of any further datasets (programs).

10.2.3 Print initialization

This sequence is typically performed at the start of a print job following powerup or wakeup:

1) Check amount of ink remaining via QA chips.

2) Download static data e.g. dither matrices, dead nozzle tables from Host to DRAM.

3) Check printhead temperature, if required, and configure printhead with firing pulse profile etc.
accordingly.

4) Initiate printhead pre-heat sequence, if required.

10.2.4 First page download 

Buffer management in a SoPEC system is normally performed by the Host.
First page, first band download and processing:

1) The Host communicates to the SoPEC CPU over the USB to check that DRAM space remaining is
sufficient to download the first band.

2) The Host downloads the first band (with the page header) to DRAM.

3) When the complete page header has been downloaded the SoPEC CPU processes the page header,
calculates PEP register commands and writes directly to PEP registers or to DRAM.

4) If PEP register commands have been written to DRAM, execute PEP commands from DRAM via
PCU.

Remaining bands download and processing: 

1) Check DRAM space remaining is sufficient to download the next band. 

2) Download the next band with the band header to DRAM. 

3) When the complete band header has been downloaded, process the band header according to 
whichever band-related register updating mechanism is being used. 

10.2.5 Start printing 

1) Wait until at least one band of the first page has been downloaded. 

One approach is to only start printing once we have loaded up the data for a complete page. If we
start printing before the complete page has been transferred to memory we run the risk of a buffer
underrun occurring because compressed page data was not transferred to SoPEC in time e.g. due to
insufficient USB bandwidth caused by another USB peripheral consuming USB bandwidth.

2) Start all the PEP Units by writing to their Go registers, via PCU commands executed from DRAM
or direct CPU writes. A typical startup order for the PEP units is outlined in Table 12.



Table 12. Typical PEP Unit startup order for printing a page.

Step#   Unit
-----   -------------
1       DNC
2       DWU
3       HCU
        LLU
        CFU, SFU, TFU
        CDU
        TE, LBD

3) Print ready interrupt occurs (from PHI).

5) Drive LEDs, monitor paper status.

6) Wait for page alignment via page sensor(s) GPIO interrupt.

8) Continue to download bands and process page and band headers for next page.

10.2.6 Next page(s) download

As for first page download, performed during printing of current page.

10.2.7 Between bands 

When the finished band flags are asserted band related registers in the CDU, LBD and TE need to be
re-programmed. This can be via PCU commands from DRAM. Typically only 3-5 commands per
decompression unit need to be executed. These registers can also be reprogrammed directly by the CPU or by
updating from shadow registers. The finished band flag interrupts the CPU to tell the CPU that the area of
memory associated with the band is now free.

10.2.8 During page print 

Typically during page printing ink usage is communicated to the QA chips.

1) Calculate ink printed (from PHI).

2) Decrement ink remaining (via QA chips).

3) Check amount of ink remaining (via QA chips). This operation may be better performed while the
page is being printed rather than at the end of the page.

10.2.9 Page finish 

These operations are typically performed when the page is finished: 

1) Page finished interrupt occurs from PHI. 

2) Shutdown the PEP blocks by de-asserting their Go registers. A typical shutdown order is defined in
Table 13. This will set the PEP Unit state-machines to their idle states without resetting their
configuration registers.

3) Communicate ink usage to QA chips, if required.

Table 13. End of page shutdown order for PEP Units (TBD).

Step#   Unit
-----   ---------------------------------------------------------------------------
1       PHI (will shutdown by itself in the normal case at the end of a page)
2       DWU (shutting this down stalls the DNC and therefore the HCU and above)
3       LLU (should already be halted due to PHI at end of last line of page)
4       TE (this is the only dot supplier likely to be running, halted by the HCU)
5       CDU (this is likely to already be halted due to end of contone band)
6       CFU, SFU, TFU, LBD (order unimportant, and should already be halted due to
        end of band)
7       HCU, DNC (order unimportant, should already have halted)
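
The shutdown order of Table 13 can be turned into a list of register writes, as sketched below. The base addresses are those of Table 11; the Go register offset within each block and the use of the value 0 to de-assert Go are assumptions made for illustration, since neither is specified in this chapter, and the function name is hypothetical.

```python
# Build the end-of-page shutdown writes in Table 13 order (illustrative sketch).

# Base addresses from Table 11.
PEP_BASES = {
    "PCU": 0x002A_0000, "CDU": 0x002A_1000, "CFU": 0x002A_2000, "LBD": 0x002A_3000,
    "SFU": 0x002A_4000, "TE": 0x002A_5000, "TFU": 0x002A_6000, "HCU": 0x002A_7000,
    "DNC": 0x002A_8000, "DWU": 0x002A_9000, "LLU": 0x002A_A000, "PHI": 0x002A_B000,
}

GO_OFFSET = 0x0   # assumption: offset of the Go register is not given in this chapter
GO_CLEAR = 0      # assumption: writing 0 de-asserts Go

# Steps 1 to 7 of Table 13; units within a step may be written in any order.
SHUTDOWN_ORDER = [
    ("PHI",), ("DWU",), ("LLU",), ("TE",), ("CDU",),
    ("CFU", "SFU", "TFU", "LBD"), ("HCU", "DNC"),
]


def shutdown_writes(order=SHUTDOWN_ORDER):
    """Return the (address, value) writes that de-assert Go in the given order."""
    return [(PEP_BASES[unit] + GO_OFFSET, GO_CLEAR)
            for step in order for unit in step]


if __name__ == "__main__":
    for address, value in shutdown_writes():
        print(f"write 0x{address:08X} <- {value}")
```

The same list could be issued as direct CPU writes or encoded as PCU commands in DRAM; which of the two is used is left open here, as it is in the text.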



10.2.10 Start of next page

These operations are typically performed before printing the next page: 

1) Re-program the PEP Units via PCU command processing from DRAM based on page header. 

2) Go to Start printing. 

10.2.11 End of document

1) Stop motor control. 

10.2.12 Powerdown 



In this mode SoPEC is no longer powered. 

1) Instruct Host PC via USB that SoPEC is about to power down. 



10.2.13 Sleep 



The CPU can put different sections of SoPEC into sleep mode by writing to registers in the CPR block
described in Section 16.

1) Instruct Host PC via USB that SoPEC is about to sleep.

2) Put SoPEC into defined sleep mode.






10.3 Normal operation in a Multi-SoPEC System - ISIMaster SoPEC 

In a multi-SoPEC system the Host generally manages program and compressed page download to all the
SoPECs. Inter-SoPEC communication is over the ISI link which will add a latency.

In a multi-SoPEC system with just one USB connection, the SoPEC with the USB connection is
the ISIMaster. The ISI-bridge chip is the ISIMaster in the case of an ISI-Bridge SoPEC configuration.

In a multi-SoPEC system one of the SoPECs will be the PrintMaster. This SoPEC must manage and
control sensors and actuators e.g. motor control. These sensors and actuators could be distributed over all the
SoPECs in the system. An ISIMaster SoPEC may also be the PrintMaster SoPEC.

In a multi-SoPEC system each printing SoPEC will generally have its own PRINTER_QA chip (or at least
access to a PRINTER_QA chip that contains the SoPEC's SOPEC_id_key) to validate operating
parameters and ink usage. The results of these operations may be communicated to the PrintMaster SoPEC.

In general the ISIMaster may need to be able to:

• Send messages to the ISISlaves which will cause the ISISlaves to send their status to the ISIMaster.

• Instruct the ISISlaves to perform certain operations.

Commands issued over the ISI are regarded as user mode commands. Supervisor mode code running on
the SoPEC CPUs will allow or disallow these commands. The software protocol needs to be constructed
with this in mind.

The ISIMaster is expected to initiate all communication with the ISISlaves.

SoPEC operation is broken up into a number of sections which are outlined below.

10.3.1 Powerup 

Powerup describes SoPEC initialisation following an external reset or the watchdog timer system reset.

1) Execute reset sequence for complete SoPEC.

2) CPU boot from ROM.

3) Basic configuration of CPU peripherals, SCB and DIU. DRAM initialisation. USB Wakeup.

4) SoPEC identification by activity on USB end-points 2-4 indicates it is the ISIMaster.

5) Download and authentication of program (see Section 10.5.3).

6) Store reusable cryptographic results in Power-Safe Storage (PSS).

7) Execution of program from DRAM.

8) Retrieve operating parameters from PRINTER_QA and authenticate operating parameters.

9) Download and authenticate any further datasets (programs).

10) The initial dataset may be broadcast to all the ISISlaves.

11) Wait for a period of time to allow the authentication to take place on the ISISlave SoPECs.

12) Each ISISlave SoPEC is polled for the result of its program code authentication process.

13) If all ISISlaves report successful authentication the OEM code module can be distributed and
authenticated. OEM code will most likely reside on one SoPEC.

10.3.2 USB wakeup

The CPU can put different sections of SoPEC into sleep mode by writing to registers in the CPR block
[16]. Normally the CPU sub-system and the DRAM will be put in sleep mode but the SCB and power-safe
storage (PSS) will still be enabled.

Wakeup describes SoPEC recovery from sleep mode with the SCB and power-safe storage (PSS) still
enabled. For an ISIMaster SoPEC, wakeup can be initiated following a USB reset from the SCB.
A typical USB wakeup sequence is:

1) Execute reset sequence for sections of SoPEC in sleep mode.

2) CPU boot from ROM, if CPU-subsystem was in sleep mode.

3) Basic configuration of CPU peripherals and DIU, and DRAM initialisation, if required.

4) SoPEC identification by activity on USB end-points 2-4 indicates it is the ISIMaster.

5) Download and authentication of program using results in Power-Safe Storage (PSS) (see Section

6) Execution of program from DRAM.

7) Retrieve operating parameters from PRINTER_QA and authenticate operating parameters.

8) Download and authenticate any further datasets (programs) using results in Power-Safe Storage
(PSS) (see Section 10.5.3).

9) Following steps as per Powerup.



10.3.3 Print initialization 



This sequence is typically performed at the start of a print job following powerup or wakeup: 

1) Check amount of ink remaining via QA chips which may be present on an ISISlave SoPEC.

2) Download static data e.g. dither matrices, dead nozzle tables from Host to DRAM.

3) Check printhead temperature, if required, and configure printhead with firing pulse profile etc.
accordingly. Instruct ISISlaves to also perform this operation.

4) Initiate printhead pre-heat sequence, if required. Instruct ISISlaves to also perform this operation.

10.3.4 First page download 

Buffer management in a SoPEC system is normally performed by the Host. 

1) The Host communicates to the SoPEC CPU over the USB to check that DRAM space remaining is
sufficient to download the first band.

2) The Host downloads the first band (with the page header) to DRAM.

3) When the complete page header has been downloaded the SoPEC CPU processes the page header,
calculates PEP register commands and writes directly to PEP registers or to DRAM.

4) If PEP register commands have been written to DRAM, execute PEP commands from DRAM via
PCU.

Poll ISISlaves for DRAM status and download compressed data to ISISlaves.

Remaining first page bands download and processing:

1) Check DRAM space remaining is sufficient to download the next band.

2) Download the next band with the band header to DRAM.

3) When the complete band header has been downloaded, process the band header according to
whichever band-related register updating mechanism is being used.

Poll ISISlaves for DRAM status and download compressed data to ISISlaves. 

10.3.5 Start printing 

1) Wait until at least one band of the first page has been downloaded. 

2) Start all the PEP Units by writing to their Go registers, via PCU commands executed from DRAM
or direct CPU writes, in the suggested order defined in Table 12.

3) Print ready interrupt occurs (from PHI). Poll ISISlaves until print ready interrupt. 






4) Start motor control (which may be on an ISISlave SoPEC), if first page, otherwise feed the next
page. This step could occur before the print ready interrupt.

5) Drive LEDs, monitor paper status (which may be on an ISISlave SoPEC).

6) Wait for page alignment via page sensor(s) GPIO interrupt (which may be on an ISISlave SoPEC).

7) CPU instructs PHI to start producing master line syncs, or wait for an external device to produce
line syncs.

8) Continue to download bands and process page and band headers for next page. 

10.3.6 Next page(s) download 

As for first page download, performed during printing of current page. 

10.3.7 Between bands 

When the finished band flags are asserted band related registers in the CDU, LBD and TE need to be
re-programmed. This can be via PCU commands from DRAM. Typically only 3-5 commands per
decompression unit need to be executed. These registers can also be reprogrammed directly by the CPU or by
updating from shadow registers. The finished band flag interrupts the CPU to tell the CPU that the area of
memory associated with the band is now free.

10.3.8 During page print

Typically during page printing ink usage is communicated to the QA chips.

1) Calculate ink printed (from PHI).

2) Decrement ink remaining (via QA chips).

3) Check amount of ink remaining (via QA chips). This operation may be better performed while the
page is being printed rather than at the end of the page.

10.3.9 Page finish

These operations are typically performed when the page is finished:

1) Page finished interrupt occurs from PHI. Poll ISISlaves for page finished interrupts.

2) Shutdown the PEP blocks by de-asserting their Go registers in the suggested order in Table 13. This
will set the PEP Unit state-machines to their startup states.

3) Communicate ink usage to QA chips, if required. 

10.3.10 Start of next page

These operations are typically performed before printing the next page:

1) Re-program the PEP Units via PCU command processing from DRAM based on page header.

2) Go to Start printing.

10.3.11 End of document

1) Stop motor control. This may be on an ISISlave SoPEC.

10.3.12 Powerdown 

In this mode SoPEC is no longer powered. 

1) Instruct Host PC via USB that SoPEC system is about to power down. 

2) Instruct ISISlave SoPECs to powerdown. 

3) Powerdown ISIMaster SoPEC, 
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10.3.13 Sleep 

The CPU can put different sections of SoPEC into sleep mode by writing to registers in the CPR block 
[16]. 

1) Instruct Host PC via USB which parts of SoPEC system are about to sleep. 

2) Put defined SoPECs into defined sleep modes. 

10.4 Normal operation in a Multi-SoPEC System - ISISlave SoPEC

This section outlines typical operation of an ISISlave SoPEC in a multi-SoPEC system. The ISIMaster
can be another SoPEC or an ISI-Bridge chip. The ISISlave communicates with the Host via the ISIMaster.
Buffer management in a SoPEC system is normally performed by the Host.

10.4.1 Powerup 

Powerup describes SoPEC initialisation following an external reset or the watchdog timer system reset.
A typical powerup sequence is:

1) Execute reset sequence for complete SoPEC.

2) CPU boot from ROM.

3) Basic configuration of CPU peripherals, SCB and DIU. DRAM initialisation.

4) Download and authentication of program (see Section 10.5.3). 

5) Store reusable cryptographic results in Power-Safe Storage (PSS). 

6) Execution of program from DRAM. 

7) Retrieve operating parameters from PRINTER_QA and authenticate operating parameters.

8) SoPEC identification by sampling GPIO pins to determine ISIId. Communicate ISIId to ISIMaster.

9) Download and authenticate any further datasets.

10.4.2 ISI wakeup

The CPU can put different sections of SoPEC into sleep mode by writing to registers in the CPR block
[16]. Normally the CPU sub-system and the DRAM will be put in sleep mode but the SCB and power-safe
storage (PSS) will still be enabled.

Wakeup describes SoPEC recovery from sleep mode with the SCB and power-safe storage (PSS) still
enabled. In an ISISlave SoPEC, wakeup can be initiated following an ISI reset from the SCB.
A typical ISI wakeup sequence is:

1) Execute reset sequence for sections of SoPEC in sleep mode. 

2) CPU boot from ROM, if CPU-subsystem was in sleep mode. 

3) Basic configuration of CPU peripherals and DIU, and DRAM initialisation, if required.

4) Download and authentication of program using results in Power-Safe Storage (PSS) (see Section

5) Execution of program from DRAM.

6) Retrieve operating parameters from PRINTER_QA and authenticate operating parameters.

7) SoPEC identification by sampling GPIO pins to determine ISIId. Communicate ISIId to ISIMaster.

8) Download and authenticate any further datasets.

10.4.3 Print initialization

This sequence is typically performed at the start of a print job following powerup or wakeup:

1) Check amount of ink remaining via QA chips. 

2) Download static data e.g. dither matrices, dead nozzle tables from ISIMaster to DRAM. 

3) Check printhead temperature, if required, and configure printhead with firing pulse profile etc.
accordingly.

4) Initiate printhead pre-heat sequence, if required.

10.4.4 First page download

Buffer management in a SoPEC system is normally performed by the Host via the ISIMaster.

1) Check DRAM space remaining is sufficient to download the first band.

2) The Host downloads the first band (with the page header) to DRAM via the ISIMaster.

3) When the complete page header has been downloaded, process the page header, calculate PEP
register commands and write directly to PEP registers or to DRAM.

4) If PEP register commands have been written to DRAM, execute PEP commands from DRAM via
the PCU.

Remaining first page bands download and processing: 

1) Check DRAM space remaining is sufficient to download the next band.

2) The Host downloads the next band (with the band header) to DRAM via the ISIMaster.

3) When the complete band header has been downloaded, process the band header according to 
whichever band-related register updating mechanism is being used.
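
The space check that precedes each band download can be sketched as below. This is only an illustrative model, assuming a single byte counter for the compressed band store; the `BandStore` class and its method names are hypothetical and do not come from this document.

```python
class BandStore:
    """Hypothetical model of the compressed band store held in DRAM."""

    def __init__(self, capacity_bytes):
        self.capacity = capacity_bytes
        self.used = 0

    def free_space(self):
        return self.capacity - self.used

    def download(self, band_bytes):
        """Step 1 then step 2: accept a band only if enough space remains."""
        if band_bytes > self.free_space():
            return False              # the Host must wait for a band to finish
        self.used += band_bytes
        return True

    def band_finished(self, band_bytes):
        """Finished band flag: the memory used by that band is free again."""
        self.used -= band_bytes
```

Because buffer management is normally performed by the Host, a model like this would more naturally live on the Host side, with the SoPEC only reporting finished bands.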

10.4.5 Start printing 

1) Wait until at least one band of the first page has been downloaded. 

2) Start all the PEP Units by writing to their Go registers, via PCU commands executed from DRAM
or direct CPU writes, in the order defined in Table 12.

3) Print ready interrupt occurs (from PHI). Communicate to ISIMaster via ISI link.

4) Start motor control, if attached to this ISISlave, when requested by ISIMaster, if first page,
otherwise feed next page. This step could occur before the print ready interrupt.

5) Drive LEDs, monitor paper status, if on this ISISlave SoPEC, when requested by ISIMaster.

6) Wait for page alignment via page sensor(s) GPIO interrupt, if on this ISISlave SoPEC, and send to
ISIMaster.

7) Wait for line sync and commence printing. 

8) Continue to download bands and process page and band headers for next page. 

10.4.6 Next page(s) download 

As for first page download, performed during printing of current page.

10.4.7 Between bands 

When the finished band flags are asserted, band related registers in the CDU, LBD and TE need to be
re-programmed. This can be via PCU commands from DRAM. Typically only 3-5 commands per
decompression unit need to be executed. These registers can also be reprogrammed directly by the CPU or by
updating from shadow registers. The finished band flag interrupts to the CPU tell the CPU that the area of
memory associated with the band is now free.

10.4.8 During page print

Typically during page printing ink usage is communicated to the QA chips.

1) Calculate ink printed (from PHI).

2) Decrement ink remaining (via QA chips).

3) Check amount of ink remaining (via QA chips). This operation may be better performed while the
page is being printed rather than at the end of the page.

10.4.9 Page finish

These operations are typically performed when the page is finished:

1) Page finished interrupt occurs from PHI. Communicate page finished interrupt to ISIMaster.

2) Shutdown the PEP blocks by de-asserting their Go registers in the suggested order in Table 13. This
will set the PEP Unit state-machines to their startup states.

3) Communicate ink usage to QA chips, if required.

10.4.10 Start of next page 

These operations are typically performed before printing the next page: 

1) Re-program the PEP Units via PCU command processing from DRAM based on page header.

2) Go to Start printing. 

10.4.11 End of document

1) Stop motor control, if attached to this ISISlave, when requested by ISIMaster.

10.4.12 Powerdown

In this mode SoPEC is no longer powered.

1) Powerdown ISISlave SoPEC when instructed by ISIMaster.

10.4.13 Sleep

The CPU can put different sections of SoPEC into sleep mode by writing to registers in the CPR block
[16].

1) Put SoPEC into defined sleep modes.

10.5 Security Use Cases

Please see the 'SoPEC Security Overview' [9] document for a more complete description of SoPEC
security issues. The SoPEC boot operation is described in the ROM chapter of the SoPEC hardware design
specification, Section 11.2.

10.5.1 Communication with the QA chips 

Communication between SoPEC and the QA chips (i.e. INK_QA and PRINTER_QA) will take place on
at least a per power cycle and per page basis. Communication with the QA chips has three principal
purposes: validating the presence of genuine QA chips (i.e. the printer is using approved consumables),
validation of the amount of ink remaining in the cartridge and authenticating the operating parameters for the
printer. After each page has been printed, SoPEC is expected to communicate the number of dots fired per
ink plane to the QA chipset. SoPEC may also initiate decoy communications with the QA chips from time
to time.

Process:

• When validating ink consumption SoPEC is expected to principally act as a conduit between the
PRINTER_QA and INK_QA chips and to take certain actions (basically enable or disable printing and
report status to Host PC) based on the result. The communication channels are insecure but all traffic is
signed to guarantee authenticity.

Known Weaknesses 

♦ All communication to the QA chips is over the LSS interfaces us^^^ 

This is open to observation and so the communication protocol could be reverse engineered. In this
case both the PRINTER_QA and INK_QA chips could be replaced by impostor devices (e.g. a single
FPGA) that successfully emulated the communication protocol. As this would require physical
modification of each printer this is considered to be an acceptably low risk. Any messages that are not signed
by one of the symmetric keys (such as the SoPEC_id_key) could be reverse engineered. The impostor
device must also have access to the appropriate keys to crack the system.

• If the secret keys in the QA chips are exposed or cracked then the system, or parts of it, is
compromised.

Assumptions:

[1] The QA chips are not involved in the authentication of downloaded SoPEC code.
[2] The QA chip in the ink cartridge (INK_QA) does not directly affect the operation of the cartridge in
any way i.e. it does not inhibit the flow of ink etc.
[3] The INK_QA and PRINTER_QA chips are identical in their virgin state. They only become an
INK_QA or PRINTER_QA after their FlashROM has been programmed.



10.5.2 Authentication of downloaded code in a single SoPEC system 
Process: 

1) SoPEC identification by activity on USB end-points 2-4 indicates it is the ISIMaster. 

2) The program is downloaded to the embedded DRAM. 

3) The CPU calculates a SHA-1 hash digest of the downloaded program.

4) The ResetSrc register in the CPR block is read to determine whether or not a power-on reset
occurred.

5) If a power-on reset occurred the signature of the downloaded code (which needs to be in a known
location such as the first or last N bytes of the downloaded code) is decrypted using the Silverbrook
public boot0key stored in ROM. This decrypted signature is the expected SHA-1 hash of the
accompanying program. The encryption algorithm is likely to be a public key algorithm such as
RSA. If a power-on reset did not occur then the expected SHA-1 hash is retrieved from the PSS and
the compute-intensive decryption is not required.

6) The calculated and expected hash values are compared and if they match then the program's
authenticity has been verified.

7) If the hash values do not match then the Host PC is notified of the failure and software may decide
to put the SoPEC device into powerdown mode.

8) If the hash values match then the CPU starts executing the downloaded program.

9) If the downloaded program wishes to download subsequent programs (such as
OEM code) it is responsible for ensuring the authenticity of everything it downloads. The
downloaded program may contain public keys that are used to authenticate subsequent downloads, thus
forming a hierarchy of authentication. The SoPEC ROM does not control these authentications - it
is solely concerned with verifying that the first program downloaded has come from a trusted
source.

^°^A!c?"lt"?n^"^°' ^"J ^ executing. TTie SUverbrook supervisor code acts as an 

^'^t''s^e;^r:orc?de"''"'^^^^ 

11) The OEM code is expected to perform some simple 'turn on the lights' tasks after which the Host
PC is informed that the printer is ready to print and the Start Printing use case comes into play.

Known Weaknesses:

• If the Silverbrook private boot0key is exposed or cracked then the system is seriously compromised. A
ROM mask change would be required to reprogram the boot0key.
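
The hash comparison in the process above, including the shortcut taken when no power-on reset occurred, can be sketched as follows. This is a toy under loudly stated assumptions: the text only says the algorithm is "likely" to be a public key algorithm such as RSA, so the sketch uses unpadded textbook RSA with a deliberately insecure key built from two Mersenne primes, and models the PSS as a plain dictionary. The key values, the `sign` helper and the `"expected_hash"` entry name are all hypothetical.

```python
import hashlib

# Toy RSA key built from two Mersenne primes. Illustrative only and NOT secure;
# the real boot0key and signature format are not given in this document.
P, Q, E = 2**127 - 1, 2**89 - 1, 65537
N = P * Q
D = pow(E, -1, (P - 1) * (Q - 1))   # private exponent, held by the signer only


def sha1_int(program: bytes) -> int:
    """SHA-1 digest of the program as an integer (always smaller than N here)."""
    return int.from_bytes(hashlib.sha1(program).digest(), "big")


def sign(program: bytes) -> int:
    """Signer side: textbook RSA over the SHA-1 digest, no padding."""
    return pow(sha1_int(program), D, N)


def authenticate(program: bytes, signature: int, power_on_reset: bool, pss: dict) -> bool:
    calculated = sha1_int(program)              # calculate the hash digest
    if power_on_reset:
        expected = pow(signature, E, N)         # decrypt signature with the public key
    else:
        expected = pss.get("expected_hash")     # reuse result, skip the decryption
    ok = calculated == expected                 # compare calculated and expected
    if ok and power_on_reset:
        pss["expected_hash"] = expected         # keep the reusable result in PSS
    return ok
```

Storing the expected hash only after a successful comparison is a design choice of this sketch; the document does not say at which point the reusable result is written to the PSS.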

10.5.3 Authentication of downloaded code in a multi-SoPEC system

10.5.3.1 ISIMaster SoPEC Process:

1) SoPEC identification by activity on USB end-points 2-4 indicates it is the ISIMaster. 

2) The SCB is configured to broadcast the data received from the Host PC.

3) The program is downloaded to the embedded DRAM and broadcast to all ISISlave SoPECs over
the ISI.

4) The CPU calculates a SHA-1 hash digest of the downloaded program.

5) The ResetSrc register in the CPR block is read to determine whether or not a power-on reset
occurred.

6) If a power-on reset occurred the signature of the downloaded code (which needs to be in a known
location such as the first or last N bytes of the downloaded code) is decrypted using the Silverbrook
public boot0key stored in ROM. This decrypted signature is the expected SHA-1 hash of the
accompanying program. The encryption algorithm is likely to be a public key algorithm such as
RSA. If a power-on reset did not occur then the expected SHA-1 hash is retrieved from the PSS and
the compute-intensive decryption is not required.

7) The calculated and expected hash values are compared and if they match then the program's
authenticity has been verified.

8) If the hash values do not match then the Host PC is notified of the failure and software may decide
to put the SoPEC device into powerdown mode.

9) If the hash values match then the CPU starts executing the downloaded program.

10) It is likely that the downloaded program will poll each ISISlave SoPEC for the result of its
authentication process and to determine the number of slaves present.

11) If any slave reports a failed authentication then the ISIMaster communicates this to the Host PC and
puts itself into powerdown mode.

12) If all slaves report successful authentication then the downloaded program is responsible for the
downloading, authentication and distribution of subsequent programs within the multi-SoPEC
system.
^^^oJsTttrnptS"*"* SUverbrook supervisor code acts as an 

Sir^TeSutlrorcSe"''^'^^ 

14) The OEM code is expected to perform some simple 'turn on the lights' tasks after which the master
SoPEC determines that all SoPECs are ready to print. The Host PC is informed that the printer is
ready to print and the Start Printing use case comes into play.
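
The polling of ISISlave results by the ISIMaster can be illustrated with the short sketch below. It assumes the results have already been collected into a mapping from ISIId to a pass/fail flag; the function name, the mapping and the action strings are hypothetical and only mirror the decision described in the process above.

```python
def poll_slaves(slave_results):
    """Summarise slave authentication results.

    slave_results maps each ISIId to True (authenticated) or False (failed).
    """
    failed = sorted(isi_id for isi_id, ok in slave_results.items() if not ok)
    return {
        "slaves_present": len(slave_results),              # number of slaves found
        "failed": failed,                                  # ISIIds to report to the Host PC
        "action": "powerdown" if failed else "continue",   # any failure stops the master
    }
```

For example, `poll_slaves({1: True, 2: False})` reports two slaves present, lists ISIId 2 as failed and selects powerdown.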



10.5.3.2 ISISlave SoPEC Process:

1) When the CPU comes out of reset the SCB should still be in slave mode, and the SCB is already
configured to receive data from the ISIMaster.

2) The program is downloaded to embedded DRAM.

3) The CPU calculates a SHA-1 hash digest of the downloaded program.

5) If a power-on reset occurred the signature of the downloaded code (which needs to be in a known
location such as the first or last N bytes of the downloaded code) is decrypted using the Silverbrook
public boot0key stored in ROM. This decrypted signature is the expected SHA-1 hash of the
accompanying program. The encryption algorithm is likely to be a public key algorithm such as
RSA. If a power-on reset did not occur then the expected SHA-1 hash is retrieved from the PSS and
the compute-intensive decryption is not required.

7) If the hash values do not match, then the ISISlave device will await a new program again,
eventually timing out and powering down.

8) If the hash values match then the CPU starts executing the downloaded program.

9) It is likely that the downloaded program will communicate the result of its authentication process to
the ISIMaster. The downloaded program is responsible for determining the SoPEC's ISIId, receiving
and authenticating any subsequent programs.

0«Ta"S^S^ OEM code staru executing. The SUverbrook supervisor code acts as an 

11) The OEM code is expected to perform some simple 'turn on the lights' tasks after which the master
SoPEC is informed that this slave is ready to print. The Start Printing use case then comes into play.

Known Weaknesses

• If the Silverbrook private boot0key is exposed or cracked then the system is seriously compromised.
• ISI is an open interface i.e. messages sent over the ISI are in the clear. The communication channels
are insecure but all traffic is signed to guarantee authenticity. As all communication over the ISI is^on!

10.5.4 Authentication and upgrade of operating parameters for a printer

The SoPEC IC will be used in a range of printers with different capabilities (e.g. A3/A4 printing, printing
speed, resolution etc.). It is expected that some printers will also have a software upgrade capability which
would allow a user to purchase a license that enables an upgrade in their printer's capabilities (such as

pZi^p L u *° '^^"^^ly «P«"«i«g parameters in the 

PRINTER_QA chip, to securely communicate these parameters to the SoPEC and to securely reprogram
the parameters in the event of an upgrade. Note that each printing SoPEC (as opposed to a SoPEC that is
lil^ Z °^ ^""^ PRINTER^QA chip (or at least access to a 

c^SJ^? ^ '^T^ SoPEC_id_key). Therefore both ISIMaster and ISISlave 

SoPECs will need to authenticate operating parameters.
Process: 

1) Program code is downloaded and authenticated as described in sections 10.5.2 and 10.5.3 above. 

2) The program code has a function to create the SoPEC_id_key from the unique SoPEC_id that was
programmed when the SoPEC was manufactured.

3) The SoPEC retrieves the signed operating parameters from its PRINTER_QA chip. The
PRINTER_QA chip uses the SoPEC_id_key (which is stored as part of the pairing process
executed during printhead assembly manufacture & test) to sign the operating parameters which are
appended with a random number to thwart replay attacks.

4) The SoPEC checks the signature of the operating parameters using its SoPEC_id_key. If this
signature authentication process is successful then the operating parameters are considered valid and the
overall boot process continues. If not the error is reported to the Host PC.

5) Operating parameters may also be set or upgraded using a second key, the PrintEngineLicense key,
which is stored on the PRINTER_QA and used to authenticate the change in operating parameters.
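
The signing and checking of operating parameters with a shared symmetric key and a random number can be sketched as below. This is an assumption-laden illustration: the document does not name the signing algorithm, so HMAC-SHA1 is swapped in as a familiar symmetric-key construction, and the sketch assumes the random number is supplied by the SoPEC for each request. Function names and the parameter encoding are hypothetical.

```python
import hashlib
import hmac


def qa_sign_parameters(params: bytes, nonce: bytes, sopec_id_key: bytes) -> bytes:
    """PRINTER_QA side: sign the parameters with the random number appended."""
    return hmac.new(sopec_id_key, params + nonce, hashlib.sha1).digest()


def sopec_check_parameters(params: bytes, nonce: bytes, signature: bytes,
                           sopec_id_key: bytes) -> bool:
    """SoPEC side: recompute the signature with its own SoPEC_id_key and compare."""
    expected = hmac.new(sopec_id_key, params + nonce, hashlib.sha1).digest()
    return hmac.compare_digest(expected, signature)
```

Because the random number is part of the signed message, a signature recorded from an earlier exchange fails the check when replayed against a fresh random number.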

Known Weaknesses: 

• It may be possible to retrieve the unique SoPEC_id by placing the SoPEC in test mode and scanning it
out. It is certainly possible to obtain it by reverse engineering the device. Either way the SoPEC_id
(and by extension the SoPEC_id_key) so obtained is valid only for that specific SoPEC and so printers
may only be compromised one at a time by parties with the appropriate specialised equipment.
Furthermore even if the SoPEC_id is compromised, the other keys in the system, which protect the
authentication of consumables and of program code, are unaffected.

10.6 Miscellaneous Use Cases

SS ^ ^*'"''^'"« *^Pl"- Software running on the SoPEC 

CPU or Host will decide on what actions to take in these scenarios. 

10.6.1 Disconnect / Reconnect of QA chips

1) Disconnect of a QA chip between documents or if ink runs out mid-document.

to'^r.'inH ^t?^^ !!!f «•«• ^ 'partridge replacement should allow the system 

to resume and print the next document 

10.6.2 Page arrives before print ready interrupt

1) Engage clutch to stop paper until print ready interrupt occurs.

10.6.3 Dead-nozzle table upgrade

1) Run printhead nozzle test sequence.

2) Either Host or SoPEC CPU converts dead nozzle information into dead nozzle table. 

3) Store dead nozzle table on Host. 

4) Write dead nozzle table to SoPEC DRAM. 
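
The conversion in step 2 could look something like the sketch below. The actual dead nozzle table format is not described in this section, so the packed one-bit-per-nozzle layout used here is purely an assumption for illustration, as are the function and argument names.

```python
def build_dead_nozzle_table(dead_nozzles, nozzle_count):
    """Pack dead nozzle indices into a bitmap, one bit per nozzle (assumed layout)."""
    table = bytearray((nozzle_count + 7) // 8)
    for n in dead_nozzles:
        if not 0 <= n < nozzle_count:
            raise ValueError("nozzle index out of range: %d" % n)
        table[n // 8] |= 1 << (n % 8)     # bit n%8 of byte n//8 marks nozzle n dead
    return bytes(table)
```

The same bytes can then be stored on the Host and written to SoPEC DRAM, matching steps 3 and 4.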

10.7 Failure Mode Use Cases

10.7.1 System errors and security violations

System errors and security violations are reported to the SoPEC CPU and Host. Software running on the
SoPEC CPU or Host will then decide what actions to take.



Silverbrook code authentication failure. 

1) Notify Host PC of authentication failure. 

2) Abort print run. 

OEM code authentication failure.

1) Notify Host PC of authentication failure.

2) Abort print run.

Invalid QA chip(s).

1) Report to Host PC.

2) Abort print run.

MMU security violation interrupt.

1) This is handled by exception handler.

2) Report to Host PC.

3) Abort print run. 

Invalid address interrupt from PCU. 

1) This is handled by exception handler. 

2) Report to Host PC. 

3) Abort print run. 
Watchdog timer interrupt. 

1) This is handled by exception handler. 

2) Report to Host PC. 

3) Abort print run. 

Host PC does not acknowledge message that SoPEC is about to power down.

1) Power down anyway.

10.7.2 Printing errors

Printing errors are reported to the SoPEC CPU and Host. Software running on the Host or SoPEC CPU
will then decide what actions to take.



Insufficient space available in SoPEC compressed band-store to download a band.

1) Report to the Host PC.

Insufficient ink to print.

1) Report to Host PC.

Page not downloaded in time while printing.

1) Buffer underrun interrupt will occur.

2) Report to Host PC and abort print run.

JPEG decoder error interrupt.

1) Report to Host PC. 
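
The failure cases listed in sections 10.7.1 and 10.7.2 can be collected into a simple lookup, sketched below. The event names and action strings are hypothetical labels chosen for illustration; the listed actions follow the steps above, and the fallback of reporting unknown events to the Host is an assumption, since the document leaves the final decision to software on the SoPEC CPU or Host.

```python
REPORT = "report_to_host"
ABORT = "abort_print_run"
HANDLER = "exception_handler"
POWERDOWN = "power_down"

# Suggested actions per failure event, in the order they are listed in the text.
FAILURE_ACTIONS = {
    "silverbrook_auth_failure": (REPORT, ABORT),
    "oem_auth_failure": (REPORT, ABORT),
    "invalid_qa_chip": (REPORT, ABORT),
    "mmu_security_violation": (HANDLER, REPORT, ABORT),
    "pcu_invalid_address": (HANDLER, REPORT, ABORT),
    "watchdog_timer": (HANDLER, REPORT, ABORT),
    "host_no_powerdown_ack": (POWERDOWN,),
    "band_store_full": (REPORT,),
    "insufficient_ink": (REPORT,),
    "buffer_underrun": (REPORT, ABORT),
    "jpeg_decoder_error": (REPORT,),
}


def actions_for(event):
    """Return the suggested actions; unknown events are only reported (assumption)."""
    return FAILURE_ACTIONS.get(event, (REPORT,))
```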

11 Central Processing Unit (CPU)



11.1 Overview 



The CPU block consists of the CPU core, MMU, cache and associated logic. The principal tasks for the
program running on the CPU to fulfill in the system are:

Communications: 

• Control the flow of data from the USB interface to the DRAM and ISI 

• Communication with the host via USB or ISI 

• Running the USB device driver 

PEP Subsystem Control: 

• Page and band header processing (may possibly be performed on host PC)

• Configure printing options on a per band, per page, per job or per power cycle basis

• Initiate page printing operation in the PEP subsystem

• Retrieve dead nozzle information from the printhead interface (PHI) and forward to the host PC 

' SStSsScs""'"'''' ''"'^^ " °^ """^^^"^ °- P'^^^^i 

• Retrieve printhead temperature via the PHI 
Security: 

• Authenticate downloaded program code and printer operating parameters 

• Authenticate consumables via the PRINTER_QA and INK_QA chips

• Monitor ink usage

• Isolation of OEM code from direct access to the system resources

Other:

• Drive the printer motors using the GPIO pins

• Monitoring the status of the printer (paper jam, tray empty etc.)

• Driving front panel LEDs

• Perform post-boot initialisation of the SoPEC device

• Memory management (likely to be in conjunction with the host PC)

• Miscellaneous housekeeping tasks 

Z^toTltbi^L^^^ S'"' ""^^ if "^"'"^ '° P"^*''' ^ of perfomiance at least equiva- 
Sio^ cSuSSlf ■ ^ ^ ^determined amoL of 

SXT^ft. I/rf '° P*"^""" other tasks. The extra performance required is dom- 

inated by the signature verification task and the SCB (including the USB) management task An ooa^tinB 

r^JLTnT"" """''^^ evaluated anrirLE^ Yl^S"! 

considered to be the most appropriate solution. A diagram of the CPU block is shown in Figure 15 below.

[Figure 15 shows the AHB controller, AHB interface, LEON core, cache and MMU, address decoder and
realtime debug unit, together with the CPU subsystem signals.]

Figure 15. CPU block diagram

11.2 Definitions of I/Os

Table 14. CPU Subsystem I/Os















Port name               Pins  I/O  Description

Clocks and Resets
prst_n                  1     In   Global reset. Synchronous to pclk, active low.
pclk                    1     In   Global clock

CPU to DIU DRAM interface
cpu_adr[21:0]           22    Out  Address bus for both DRAM and peripheral access
cpu_dataout[31:0]       32    Out  Data out to both DRAM and peripheral devices. This should be
                                   driven at the same time as the cpu_adr and request signals.
dram_cpu_data[255:0]    256   In   Read data from the DRAM
cpu_diu_rreq            1     Out  Read request to the DIU DRAM
diu_cpu_rack            1     In   Acknowledge from DIU that read request has been accepted
diu_cpu_rvalid          1     In   Signal from DIU telling SoPEC Unit that valid read data is on the
                                   dram_cpu_data bus
cpu_diu_wreq            1     Out  Write request to the DIU
diu_cpu_wack            1     In   Acknowledge from the DIU that the write request has been
                                   accepted
cpu_diu_wvalid          1     Out  Signal from the CPU to the DIU indicating that the data currently on
                                   the cpu_dataout bus is valid
cpu_diu_wmask[1:0]      2     Out  Flag indicating format of CPU write to DRAM
                                   cpu_diu_wmask = 00: 8-bit write
                                   cpu_diu_wmask = 01: 16-bit write
                                   cpu_diu_wmask = 10: 32-bit write
                                   cpu_diu_wmask = 11: reserved
                                   cpu_adr[2:0] are driven in accordance with the width of the data
                                   access indicated by cpu_diu_wmask. Addresses cannot cross a
                                   256-bit word DRAM boundary.

CPU to peripheral blocks
cpu_rwn                 1     Out  Common read/not-write signal from the CPU
cpu_acode[1:0]          2     Out  CPU access code signals.
                                   cpu_acode[0] - Program (0) / Data (1) access
                                   cpu_acode[1] - User (0) / Supervisor (1) access
cpu_cpr_sel             1     Out  CPR block select
cpr_cpu_rdy             1     In   Ready signal to the CPU. When cpr_cpu_rdy is high it indicates the
                                   last cycle of the access. For a write cycle this means cpu_dataout
                                   has been registered by the CPR block and for a read cycle this
                                   means the data on cpr_cpu_data is valid.
cpr_cpu_berr            1     In   CPR bus error signal to the CPU.
cpr_cpu_data[31:0]      32    In   Read data bus from the CPR block
cpu_gpio_sel            1     Out  GPIO block select
gpio_cpu_rdy            1     In   GPIO ready signal to the CPU.
gpio_cpu_berr           1     In   GPIO bus error signal to the CPU.
gpio_cpu_data[31:0]     32    In   Read data bus from the GPIO block
cpu_icu_sel             1     Out  ICU block select
icu_cpu_rdy             1     In   ICU ready signal to the CPU.
icu_cpu_berr            1     In   ICU bus error signal to the CPU.
icu_cpu_data[31:0]      32    In   Read data bus from the ICU block
cpu_lss_sel             1     Out  LSS block select
lss_cpu_rdy             1     In   LSS ready signal to the CPU.
lss_cpu_berr            1     In   LSS bus error signal to the CPU.
lss_cpu_data[31:0]      32    In   Read data bus from the LSS block
cpu_pcu_sel             1     Out  PCU block select
pcu_cpu_rdy             1     In   PCU ready signal to the CPU.
pcu_cpu_berr            1     In   PCU bus error signal to the CPU.
pcu_cpu_data[31:0]      32    In   Read data bus from the PCU block
cpu_scb_sel             1     Out  SCB block select
scb_cpu_rdy             1     In   SCB ready signal to the CPU.
scb_cpu_berr            1     In   SCB bus error signal to the CPU.
scb_cpu_data[31:0]      32    In   Read data bus from the SCB block
cpu_tim_sel             1     Out  Timers block select.
tim_cpu_rdy             1     In   Timers block ready signal to the CPU.
tim_cpu_berr            1     In   Timers bus error signal to the CPU.
tim_cpu_data[31:0]      32    In   Read data bus from the Timers block
cpu_rom_sel             1     Out  ROM block select
rom_cpu_rdy             1     In   ROM block ready signal to the CPU.
rom_cpu_berr            1     In   ROM bus error signal to the CPU.
rom_cpu_data[31:0]      32    In   Read data bus from the ROM block
cpu_pss_sel             1     Out  PSS block select
pss_cpu_rdy             1     In   PSS block ready signal to the CPU.
pss_cpu_berr            1     In   PSS bus error signal to the CPU.
pss_cpu_data[31:0]      32    In   Read data bus from the PSS block
cpu_diu_sel             1     Out  DIU register block select.
diu_cpu_rdy             1     In   DIU register block ready signal to the CPU.
diu_cpu_berr            1     In   DIU bus error signal to the CPU.
diu_cpu_data[31:0]      32    In   Read data bus from the DIU block

Interrupt signals
icu_cpu_ilevel[3:0]     3     In   An interrupt is asserted by driving the appropriate priority level on
                                   icu_cpu_ilevel. These signals must remain asserted until the CPU
                                   executes an interrupt acknowledge cycle.
cpu_icu_ilevel[3:0]     3     Out  Indicates the level of the interrupt the CPU is acknowledging when
                                   cpu_iack is high
cpu_iack                1     Out  Interrupt acknowledge signal. The exact timing depends on the
                                   CPU core implementation

Debug signals
diu_cpu_debug_valid     1     In   Signal indicating the data on the diu_cpu_data bus is valid debug
                                   data.
tim_cpu_debug_valid     1     In   Signal indicating the data on the tim_cpu_data bus is valid debug
                                   data.
scb_cpu_debug_valid     1     In   Signal indicating the data on the scb_cpu_data bus is valid debug
                                   data.
pcu_cpu_debug_valid     1     In   Signal indicating the data on the pcu_cpu_data bus is valid debug
                                   data.
lss_cpu_debug_valid     1     In   Signal indicating the data on the lss_cpu_data bus is valid debug
                                   data.
icu_cpu_debug_valid     1     In   Signal indicating the data on the icu_cpu_data bus is valid debug
                                   data.
gpio_cpu_debug_valid    1     In   Signal indicating the data on the gpio_cpu_data bus is valid debug
                                   data.
cpr_cpu_debug_valid     1     In   Signal indicating the data on the cpr_cpu_data bus is valid debug
                                   data.
debug_data_out          18    Out  Output debug data to be muxed on to the PHI pins
debug_data_valid        1     Out  Debug valid signal indicating the validity of the data on
                                   debug_data_out. This signal is used in all debug configurations
debug_cntrl             20    Out  Control signal for each PHI bound debug data line indicating
                                   whether or not the debug data should be selected by the pin mux
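
The cpu_diu_wmask encoding and the 256-bit boundary rule from Table 14 can be checked with a few lines of code. This is a minimal sketch that assumes cpu_adr is a byte address, which the table does not state explicitly; the function name and return shape are hypothetical.

```python
WMASK_WIDTH_BITS = {0b00: 8, 0b01: 16, 0b10: 32}   # 0b11 is reserved

DRAM_WORD_BYTES = 256 // 8   # one 256-bit DRAM word


def check_cpu_write(byte_addr, wmask):
    """Return (stays_within_one_dram_word, width_in_bytes) for a CPU write."""
    if wmask not in WMASK_WIDTH_BITS:
        raise ValueError("reserved cpu_diu_wmask encoding")
    width_bytes = WMASK_WIDTH_BITS[wmask] // 8
    first_word = byte_addr // DRAM_WORD_BYTES
    last_word = (byte_addr + width_bytes - 1) // DRAM_WORD_BYTES
    return first_word == last_word, width_bytes
```

Under the byte-address assumption, a 32-bit write starting at byte 28 fits inside the first 256-bit word, while one starting at byte 30 would cross into the next word and is therefore not allowed.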



11.3 Realtime requirements

The SoPEC realtime requirements have yet to be fully determined but they may be split into three
categories: hard, firm and soft.

11.3.1 Hard realtime requirements 

Hard requirements are tasks that must be completed before a certain deadline or failure to do so will result
in an error perceptible to the user (printing stops or functions incorrectly). There are three hard realtime
requirements:

• Motor control: The motors which feed the paper through the printer at a constant speed during
printing are driven directly by the SoPEC device. Four periodic signals with different phase
relationships need to be generated to ensure the paper travels smoothly through the printer. The
generation of these signals is handled by the GPIO hardware (see section 13.2 for more details) but the
CPU is responsible for enabling these signals (i.e. to start or stop the motors) and coordinating the
movement of the paper with the printing operation of the printhead.

• Buffer management: Data enters the SoPEC via the SCB at an uneven rate and is consumed by the
PEP subsystem at a different rate. The CPU is responsible for managing the DRAM buffers to
ensure that neither overrun nor underrun occur. This buffer management is likely to be performed
under the direction of the host.

• Band processing: In certain cases PEP registers may need to be updated between bands. As the
timing requirements are most likely too stringent to be met by direct CPU writes to the PCU a more
likely scenario is that a set of shadow registers will be programmed in the compressed page units
before the current band is finished, copied to band related registers by the finished band signals and
the processing of the next band will continue immediately. An alternative solution is that the CPU
will construct a DRAM based set of commands (see section 21.8.5 for more details) that can be
executed by the PCU. The task for the CPU here is to parse the band headers stored in DRAM and
generate a DRAM based set of commands for the next number of bands. The location of the DRAM
based set of commands must then be written to the PCU before the current band has been processed
by the PEP subsystem. It is also conceivable (but currently considered unlikely) that the host PC
could create the DRAM based commands. In this case the CPU will only be required to point the
PCU to the correct location in DRAM to execute commands from.
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11.3^ Rrm requirements 

Firm requirements are tasks that should be completed by a certain time or failure to do so will result in a
degradation of performance but not an error. The majority of the CPU tasks for SoPEC fall into this
category including all interactions with the QA chips, program authentication, page feeding, configuring PEP
registers for a page or job, determining the firing pulse profile, communication of printer status to the host
over the USB and the monitoring of ink usage. The authentication of downloaded programs and messages
will be the most compute intensive operation the CPU will be required to perform. Initial investigations
indicate that the LEON processor, running at 160 MHz, will easily perform three authentications in under
a second.

Table 15. Expected firm requirements

Requirement                                                                      Duration
Power-on to start of printing first page [USB and slave SoPEC enumeration,
  3 or more signature verifications, code and compressed page data download
  and chip initialisation]                                                       ~ 8 secs ??
Wake-up from sleep mode to start printing [3 or more SHA-1 operations, code
  and compressed page data download and chip reinitialisation]                   ~ 2 secs
Authenticate ink usage in the printer                                            ~ 0.5 secs
Determining firing pulse profile                                                 ~ 0.1 secs
Page feeding, gap between pages                                                  OEM dependent
Communication of printer status to host PC                                       ~ 10 ms
Configuring PEP registers                                                        ??


11.3.3 Soft requirements

Soft requirements are tasks that need to be done but there are only light time constraints on when they need
to be done. These tasks are performed by the CPU when there are no pending higher priority tasks. As the
SoPEC CPU is expected to be lightly loaded these tasks will mostly be executed soon after they are scheduled.



11.4 Bus Protocols 

As can be seen from Figure 15 above there are different buses in the CPU block and different protocols are
used for each bus. There are three buses in operation:

11.4.1 CPU core to cache/MMU bus

This is the native bus of the CPU core. See section 11.6.6.1 for more details. Timing and full signal details
should be provided in the documentation accompanying this core.

11.4.2 Cache/MMU to DIU bus 

This bus conforms to the DIU bus protocol described in Section 20.13.2. Note that the address and data
[...] peripheral bus. The effective bus width differs between a read (256 bits) and a
write [...] shared with the peripheral bus. As certain
[...] may require byte write access this will need to be supported in the DIU. See section
11.6.6.2 for more details.

11.4.3 CPU Subsystem Bus

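For connection to the CPU subsystem peripherals a simple bus protocol is used. The MMU must first determine
which particular block is being addressed (and that the access is a valid one) so that the appropriate block
select signal can be generated. During a write access CPU write data is driven out with the address and block
select signals in the first cycle of an access. The addressed slave peripheral responds by asserting its ready
signal indicating that it has registered the write data and the access can complete. The write data bus is
common to all peripherals and is also used for CPU writes to the embedded DRAM. A read access is initiated
by driving the address and select signals during the first cycle of an access. The addressed slave responds
by placing the read data on its bus and asserting its ready signal to indicate to the CPU that the read data
is valid. Each block has a separate point-to-point data bus for read accesses to avoid the need for
a tri-stateable bus.

Support for [...] may be added if required by an [...]. The use of the ready signal allows accesses to be of
variable length. In most cases accesses will complete in two cycles but three or four (or more) cycle accesses
are likely for PEP blocks or IP blocks with a different native bus interface. All PEP blocks are accessed via
the PCU which acts as a bridge. The duration of accesses to the PEP blocks is influenced by whether or not
the PCU is executing commands from DRAM. As these commands are essentially register writes the CPU access
will need to wait until the PCU bus becomes available when a register access has been completed. This could
lead to the CPU being stalled if it attempts to access PEP blocks while the PCU is executing a command. The
size and probability of this penalty is small enough not to have any significant impact on performance.

In order to support user mode (i.e. OEM code) access to certain peripherals the CPU subsystem bus propagates
the CPU function code signals (cpu_acode[1:0]). These signals indicate the type of address space
(i.e. User/Supervisor and Program/Data) being accessed by the CPU for each access. Each peripheral must
determine whether or not the CPU is in the correct mode to be granted access to its registers and in some
cases (e.g. Timers and GPIO blocks) different access permissions can apply to different registers within
the block. If the CPU is not in the correct mode then the violation is flagged by asserting the block's bus
error signal (block_cpu_berr) with the same timing as its ready signal (block_cpu_rdy) which remains
deasserted. When this occurs invalid read accesses should return 0 and write accesses should have no
effect.

Figure 16 shows two examples of the peripheral bus protocol in action. A write to the LSS block from
code running in supervisor mode is successfully completed. This is immediately followed by a read from
a PEP block via the PCU from code running in user mode. As this type of access is not permitted the access
is terminated with a bus error. The bus error exception processing then starts directly after this - no further
accesses to the peripheral should be required as the exception handler should be located in the DRAM.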



Doc: SoPEC_hardware.design 
Version: 2.3 



S3 Proprietary Document 



29 Nov 2002 
Page 70 



SoPEC : Hardware Design 



[Figure 16: timing diagram. Signals shown: pclk, cpu_adr[21:0] (LSS address, PEP address, supervisor
address), cpu_rwn, cpu_acode[1:0] (Supvr Data, User Data, Supvr Data), cpu_lss_sel, lss_cpu_rdy,
lss_cpu_berr, cpu_dataout[31:0] (LSS data), cpu_pcu_sel, pcu_cpu_berr, pcu_cpu_rdy and
pcu_cpu_data[31:0] (0x0000_0000).]

Figure 16. CPU bus transactions



11.4.3.1 CPU subsystem bus slave state machine

CPU subsystem bus slave operation is described by the state machine in Figure 17. This state machine
will be implemented in each CPU subsystem bus slave. The only new signals mentioned here are the
valid_access and reg_available signals. The valid_access signal is determined by comparing the cpu_acode
value with the block or register (in the case of a block that allows user access on a per register basis such
as the GPIO block) access permissions and asserting valid_access if the permissions agree with the CPU
mode. The reg_available signal is only required in the PCU or in blocks that are not capable of two-cycle
access (e.g. blocks containing imported IP with different bus protocols). In these blocks the reg_available
signal is an internal signal used to insert wait states (by delaying the assertion of block_cpu_rdy) until the
CPU bus slave interface can gain access to the register.

When reading from a register that is less than 32 bits wide the CPU subsystem bus slave should return
zeroes on the unused upper bits of the block_cpu_data bus.

To support debug mode the contents of the register selected for debug observation, debug_reg, are always
output on the block_cpu_data bus whenever a read access is not taking place. See section 11.8 for more
details of debug operation.

[Figure 17: state machine diagram. Recoverable labels: prst_n, block_cpu_berr, block_cpu_data (driven with
either the register data or the debug register data), block_cpu_debug_valid and an invalid write access
transition.]

Figure 17. State machine for a CPU subsystem slave
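
The read-side behaviour of a bus slave described in this section can be sketched in Python. This is only an illustrative model under stated assumptions, not the hardware implementation: the SlaveRegister class, the slave_read function and the user_readable flag are hypothetical names, and cycle timing, wait states and debug output are ignored.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class SlaveRegister:
    """Hypothetical model of one register in a CPU subsystem bus slave."""
    value: int
    width: int            # register width in bits (1..32)
    user_readable: bool   # assumed per-register permission flag


def slave_read(reg: SlaveRegister, supervisor: bool) -> Tuple[int, bool]:
    """Return (block_cpu_data, block_cpu_berr) for a read access.

    valid_access is modelled as: supervisor mode may always read,
    user mode may read only if the register allows it (an assumption).
    """
    valid_access = supervisor or reg.user_readable
    if not valid_access:
        # Invalid read accesses return 0 and are flagged with a bus error.
        return 0, True
    # Unused upper bits of the 32-bit data bus are returned as zeroes.
    mask = (1 << reg.width) - 1
    return reg.value & mask, False
```

For example, a 12-bit supervisor-only register read in user mode yields data 0 with the bus error flag set, while the same read in supervisor mode yields only the low 12 bits.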

11.5 LEON CPU 

The LEON processor is an open-source implementation of the IEEE-1754 standard (SPARC V8) instruction
set. LEON is available from and actively supported by Gaisler Research (www.gaisler.com).
The following features of the LEON-2 processor will be utilised on SoPEC:

• IEEE-1754 (SPARC V8) compatible integer unit with 5-stage pipeline

• Separate instruction and data cache (Harvard architecture)

• Set-associative caches: 1-4 sets, 1-64 kbyte/set. Random, LRR or LRU replacement. Direct
mapped caches are also available and are the more likely option for SoPEC.

• Full implementation of AMBA-2.0 AHB on-chip bus 

• Power-down mode 

The standard release of LEON incorporates a number of peripherals and support blocks which will not be
included on SoPEC. The LEON core as used on SoPEC will consist of: 1) the LEON integer unit, 2) possibly
the instruction and data caches (currently under review), 3) the cache control logic (to be significantly
reduced by optimisation if the caches are not used), 4) the AHB interface and 5) possibly the AHB
controller (although this functionality may be implemented in the LEON Bridge).

The version of the LEON database that the SoPEC LEON components will be sourced from is LEON2-1.0.8
although later versions may be used if they offer worthwhile functionality or bug fixes that affect the
SoPEC design. Note that if the LEON caches are not used then we may revert to v1.0.7 of the database as
the cache control logic is likely to be simpler and easier to optimise away (v1.0.8 introduced support for
set-associative caching).

The LEON core will be clocked using the system clock, pclk, and reset using the prst_n_section[1] signal.

The ICU will assert all the hardware interrupts using the protocol described in section 11.9. The particular
types of SRAMs (for LEON caches) and register files used will be determined during the implementation
phase. Hardware multipliers are not expected to be required. Furthermore it is anticipated that
SoPEC will use the recommended 8 register window configuration.

Further details of the SPARC V8 instruction set and the LEON processor can be found in [32] and [33]
respectively.

11.6 Memory Management Unit (MMU)

Memory Management Units are typically used to protect certain regions of memory from invalid accesses,
to perform address translation for a virtual memory system and to maintain memory page status
(swapped-in, swapped-out or unmapped).

The SoPEC MMU is a much simpler affair whose function is to ensure that all regions of the SoPEC memory
map are adequately protected. The MMU does not support virtual memory and physical addresses are
used at all times - the one exception to this is the address translation of the reset vector. The SoPEC MMU
supports a full 32-bit address space. A proposed memory map is shown in Figure 18 below.
The MMU selects the relevant bus protocol and generates the appropriate control signals depending on the
area of memory being accessed. The MMU is responsible for performing the address decode and generation
of the appropriate block select signal as well as the selection of the correct block read bus during a
read access. The MMU will need to support all of the bus transactions the CPU can produce including
interrupt acknowledge cycles, aborted transactions etc.

When an MMU error occurs (such as an attempt to access a supervisor mode only region when in user
mode) a bus error is generated. While the LEON can recognise different types of bus error (e.g. data store
error, instruction access error) it appears to handle them in the same manner as it handles all traps i.e. it
will transfer control to a trap handler. No extra state information appears to be stored because of the nature
of the trap. The location of the trap handler is contained in the TBR (Trap Base Register). This is the same
mechanism as is used to handle interrupts. Further investigation is needed to determine exactly how LEON
behaves when a bus error type trap occurs to determine the best approach to handling bus errors. It may be
simplest to just treat them as the highest priority interrupt.

[Figure 18: memory map diagram. Recoverable contents, from the top of the address space down:]

0xFFFF_FFFF
      Accesses in this area are not allowed and result in a bus error exception.
0x002A_C000
      PCU Mapped Registers
0x002A_0000
      Peripheral Registers
0x0029_0000
      ROM
0x0028_0000
      DRAM (DRAM regions)
0x0000_0000

Annotations in the figure: "Accesses in this area are via the CPU bus and are controlled by permissions
set in each peripheral." and "Accesses in this area are via the DIU bus and are controlled by permissions
set in the MMU."

Figure 18. Proposed SoPEC CPU memory map (not to scale)
11.6.1 CPU-bus peripherals address map

The address mapping for the peripherals attached to the CPU-bus is shown in Table 16 below. The MMU
performs the decode of cpu_adr[21:12] to generate the relevant cpu_block_select signal. Apart from the
PCU, which decodes the address space for the PEP blocks, each block only needs to decode as many bits
of cpu_adr[11:2] as required to address all the registers within the block.

Table 16. CPU-bus peripherals address map

Block base    Address
MMU_base      0x0029_0000
TIM_base      0x0029_1000
LSS_base      0x0029_2000
GPIO_base     0x0029_3000
SCB_base      0x0029_4000
ICU_base      0x0029_5000
CPR_base      0x0029_6000
ROM_base      0x0029_7000
DIU_base      0x0029_8000
PSS_base      0x0029_9000
Reserved      0x0029_A000 to 0x0029_FFFF
PCU_base      0x002A_0000
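
The address decode implied by Table 16 can be sketched as follows. This is a minimal illustration, assuming one 4 KB slot per block (as the spacing of the base addresses suggests) and using the PCU mapped register range given in section 11.6.3; the names BLOCK_SLOTS, decode_block_select and register_offset are hypothetical and do not come from the design.

```python
from typing import Optional

# Block bases from Table 16, keyed by cpu_adr[21:12] (one 4 KB slot per block).
BLOCK_SLOTS = {
    0x0029_0000 >> 12: "MMU",
    0x0029_1000 >> 12: "TIM",
    0x0029_2000 >> 12: "LSS",
    0x0029_3000 >> 12: "GPIO",
    0x0029_4000 >> 12: "SCB",
    0x0029_5000 >> 12: "ICU",
    0x0029_6000 >> 12: "CPR",
    0x0029_7000 >> 12: "ROM",
    0x0029_8000 >> 12: "DIU",
    0x0029_9000 >> 12: "PSS",
}
PCU_BASE = 0x002A_0000
PCU_TOP = 0x002A_BFFF  # upper bound of the PCU mapped registers (section 11.6.3)


def decode_block_select(cpu_adr: int) -> Optional[str]:
    """Return the selected block name, or None for reserved/unmapped addresses."""
    if PCU_BASE <= cpu_adr <= PCU_TOP:
        return "PCU"  # the PCU decodes the PEP block address space itself
    return BLOCK_SLOTS.get(cpu_adr >> 12)


def register_offset(cpu_adr: int) -> int:
    """Word-aligned register offset within a block, i.e. cpu_adr[11:2] << 2."""
    return cpu_adr & 0xFFC
```

An address in the reserved range 0x0029_A000 to 0x0029_FFFF returns None in this model; in the real device such an access would be expected to end in a bus error.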



11.6.2 DRAM Region Mapping

The embedded DRAM is broken into 8 regions, with each region defined by a lower and upper bound
address and with its own access permissions.

The association of an area in the DRAM address space with a MMU region is completely under software
control. Table 17 below gives one possible region mapping. Regions should be defined according to their
access requirements and position in memory. Regions that share the same access requirements and that are
contiguous in memory may be combined into a single region. The example below is purely for indicative
purposes - real mappings are likely to differ significantly from this. Note that the RegionBottom and
RegionTop fields in this example are byte aligned and would need to be right-shifted by 5 places to obtain
the 256-bit aligned value used to program the RegionNTop and RegionNBottom registers. For more details see
11.6.5.1 and 11.6.5.2.



Table 17. Example region mapping

Region  RegionBottom  RegionTop    Description
0       0x0000_0000   0x0000_0FFF  Silverbrook OS (supervisor) data
1       0x0000_1000   0x0000_BFFF  Silverbrook OS (supervisor) code
2       0x0000_C000   0x0000_C3FF  Silverbrook (supervisor/user) data
3       0x0000_C400   0x0000_CFFF  Silverbrook (supervisor/user) code
4       0x0026_D000   0x0026_D3FF  OEM (user) data
5       0x0026_D400   0x0026_DFFF  OEM (user) code
6       0x0027_E000   0x0027_FFFF  Shared Silverbrook/OEM space
7       0x0000_D000   0x0026_CFFF  Compressed page store (supervisor data)
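
The conversion between the byte-aligned addresses of Table 17 and the 256-bit aligned register values can be illustrated with a short Python sketch. The function names are hypothetical; the shift by 5 and the 17-bit register width come from this document.

```python
WORD_SHIFT = 5        # one 256-bit DRAM word = 32 bytes = 2**5
REG_MASK = 0x1FFFF    # RegionNTop/RegionNBottom are 17 bits wide


def to_region_register(byte_addr: int) -> int:
    """Convert a byte address into a RegionNTop/RegionNBottom register value."""
    return (byte_addr >> WORD_SHIFT) & REG_MASK


def to_byte_address(reg_value: int) -> int:
    """Convert a register value back to the byte address of the 256-bit word."""
    return (reg_value & REG_MASK) << WORD_SHIFT
```

For region 1 of Table 17 this gives a RegionBottom register value of 0x80 and a RegionTop register value of 0x5FF. Converting the top value back yields 0x0000_BFE0, the first byte of the last 256-bit word in the region.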



11.6.3 Non-DRAM regions

As shown in Figure 18 the DRAM occupies only 2.5 MBytes of the total 4 GB SoPEC address space. The
non-DRAM regions of SoPEC are handled by the MMU as follows:

ROM (0x0028_0000 to 0x0028_FFFF): The ROM block will control the access types allowed. The
cpu_acode[1:0] signals will indicate the CPU mode and access type and the ROM block will assert
rom_cpu_berr if an attempted access is forbidden. The protocol is described in more detail in section
11.4.3. The ROM block access permissions are hard wired to allow all read accesses except to the
FuseChipID registers which may only be read in supervisor mode.

MMU Internal Registers (0x0029_0000 to 0x0029_0FFF): The MMU is responsible for controlling the
accesses to its own internal registers and will only allow data reads and writes (no instruction fetches)
from supervisor data space. All other accesses will result in the mmu_cpu_berr signal being asserted in
accordance with the CPU native bus protocol.

CPU Subsystem Peripheral Registers (0x0029_1000 to 0x0029_FFFF): Each peripheral block will
control the access types allowed. Every peripheral will allow supervisor data accesses (both read and
write) and some blocks (e.g. Timers and GPIO) will also allow user data space accesses as outlined in the
relevant chapters of this specification. Neither supervisor nor user instruction fetch accesses are allowed to
any peripheral block. The bus protocol is described in section 11.4.3.

PCU Mapped Registers (0x002A_0000 to 0x002A_BFFF): All of the PEP block registers which are
accessed by the CPU via the PCU will inherit the access permissions of the PCU.

Unused address space (0x002A_C000 to 0xFFFF_FFFF): All accesses to the unused portion of the
address space will result in the mmu_cpu_berr signal being asserted in accordance with the CPU native bus
protocol. These accesses will not propagate outside of the MMU i.e. no external access will be initiated.



11.6.4 Reset exception vector and reference zero traps

When a reset occurs the LEON processor starts executing code from address 0x0000_0000. On SoPEC the
embedded DRAM occupies this area of the address map. As the DRAM contents are undefined when the
[...] 0x0000_0000 through 0x0000_00?? (the minimum [...]

A common software bug is zero-referencing or null pointer de-referencing (where the program attempts to
access the contents of address 0x0000_0000). To assist software debug the MMU will assert a bus error
every time the reset locations are accessed after the reset trap handler has legitimately been retrieved
immediately after reset. If desired this condition could result in a unique trap [...]

11.6.5 MMU Configuration Registers

These are the only configuration registers in the CPU block. Note that all the MMU configuration registers
may only be accessed when the CPU is running in supervisor mode.



Table 18. MMU Configuration Registers

Address  Register        #bits  Reset     Description
0x00     Region0Bottom   17     0x0_0000  This register contains the physical address that marks the bottom of region 0
0x04     Region0Top      17     0x1_FFFF  This register contains the physical address that marks the top of region 0.
                                          Region 0 covers the entire address space after reset whereas all other
                                          regions are zero-sized initially.
0x08     Region1Bottom   17     0x0_0000  This register contains the physical address that marks the bottom of region 1
0x0C     Region1Top      17     0x0_0000  This register contains the physical address that marks the top of region 1
0x10     Region2Bottom   17     0x0_0000  This register contains the physical address that marks the bottom of region 2
0x14     Region2Top      17     0x0_0000  This register contains the physical address that marks the top of region 2
0x18     Region3Bottom   17     0x0_0000  This register contains the physical address that marks the bottom of region 3
0x1C     Region3Top      17     0x0_0000  This register contains the physical address that marks the top of region 3
0x20     Region4Bottom   17     0x0_0000  This register contains the physical address that marks the bottom of region 4
0x24     Region4Top      17     0x0_0000  This register contains the physical address that marks the top of region 4
0x28     Region5Bottom   17     0x0_0000  This register contains the physical address that marks the bottom of region 5
0x2C     Region5Top      17     0x0_0000  This register contains the physical address that marks the top of region 5
0x30     Region6Bottom   17     0x0_0000  This register contains the physical address that marks the bottom of region 6
0x34     Region6Top      17     0x0_0000  This register contains the physical address that marks the top of region 6
0x38     Region7Bottom   17     0x0_0000  This register contains the physical address that marks the bottom of region 7
0x3C     Region7Top      17     0x0_0000  This register contains the physical address that marks the top of region 7
0x40     Region0Control  6      0x07      Control register for region 0
0x44     Region1Control  6      0x07      Control register for region 1
0x48     Region2Control  6      0x07      Control register for region 2
0x4C     Region3Control  6      0x07      Control register for region 3
0x50     Region4Control  6      0x07      Control register for region 4
0x54     Region5Control  6      0x07      Control register for region 5
0x58     Region6Control  6      0x07      Control register for region 6
0x5C     Region7Control  6      0x07      Control register for region 7
0x60     BusTimeout      16     0x00FF    This register should be set to the number of pclk cycles to wait before
                                          aborting an access with a bus error.
0x64     DebugSelect     7      0x00      Contains address of the register selected for debug observation. It is
                                          expected that a number of pseudo-registers will be made available for debug
                                          observation and these will be outlined during the implementation phase.


11.6.5.1 RegionTop and RegionBottom registers

The 20 Mbit of embedded DRAM on SoPEC is arranged as 81920 words of 256 bits each. All region
boundaries need to align with a 256-bit word. Thus only 17 bits are required for the RegionNTop and
RegionNBottom registers. The byte address of these locations can be obtained by simply left-shifting the
register value by 5 bits i.e. cpu_adr[21:0] = RegionNTop/Bottom[16:0] << 5.

Both the RegionNTop and RegionNBottom registers are inclusive i.e. the addresses in the registers are
included in the region. The size of smallest active region is therefore 2 256-bit words i.e. 64 bytes.

If DRAM regions overlap (there is no reason for this to be the case but there is nothing to prohibit it either)
then only accesses allowed by all overlapping regions are permitted. That is, if a DRAM address appears in
both Region1 and Region3 (for example) the cpu_acode of an access is checked against the access
permissions of both regions. If both regions permit the access then it will proceed but if either or both
regions do not permit the access then it will not be allowed.

The MMU does not support negatively sized regions i.e. the value of the RegionNTop register should
always be greater than the value of the RegionNBottom register. If RegionNTop is lower in the address map
than RegionNBottom then the region is considered to be zero-sized and is ignored.

When both the RegionNTop and RegionNBottom registers for a region contain the same value the region is 
then simply one 256-bit word in length and this corresponds to the smallest possible active region. 
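
The sizing rule in the two paragraphs above can be sketched in Python. This is an illustrative model only, with hypothetical function names; register values are expressed in 256-bit words and both bounds are treated as inclusive.

```python
WORD_BYTES = 32  # one 256-bit DRAM word


def region_size_words(bottom: int, top: int) -> int:
    """Size of a region in 256-bit words, given its register values."""
    if top < bottom:
        return 0  # considered zero-sized and ignored
    return top - bottom + 1  # inclusive bounds: equal values give one word


def region_size_bytes(bottom: int, top: int) -> int:
    """Size of a region in bytes."""
    return region_size_words(bottom, top) * WORD_BYTES
```

With register values 0x80 and 0x5FF, for instance, the region spans 0x580 words, or 0xB000 bytes.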

11.6.5.2 Region Control registers 

Each memory region has a control register associated with it. The RegionNControl register is used to set
the access conditions for the memory region bounded by the RegionNTop and RegionNBottom registers.
Table 19 describes the function of each bit field in the RegionNControl registers. All bits in a
RegionNControl register are both readable and writable by design. However, like all registers in the MMU
the RegionNControl registers can only be accessed by code running in supervisor mode.

Table 19. Region Control Register

Field name        Bits  Description
SupervisorAccess  2:0   Denotes the type of access allowed when the CPU is running in Supervisor mode.
                        For each access type a 1 indicates the access is permitted and a 0 indicates
                        the access is not permitted.
                        bit0 - Data read access permission
                        bit1 - Data write access permission
                        bit2 - Instruction fetch access permission
UserAccess        5:3   Denotes the type of access allowed when the CPU is running in User mode.
                        For each access type a 1 indicates the access is permitted and a 0 indicates
                        the access is not permitted.
                        bit3 - Data read access permission
                        bit4 - Data write access permission
                        bit5 - Instruction fetch access permission
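
The permission check, including the rule that every overlapping region must permit an access, can be sketched as below. This is a simplified model rather than the MMU logic itself: the Region class and the constant names are hypothetical, addresses are expressed in 256-bit words, and the rejection of addresses that fall in no region is an assumption made for the example.

```python
from dataclasses import dataclass

# Bit positions within each 3-bit access field of RegionNControl.
DATA_READ, DATA_WRITE, INSTR_FETCH = 0, 1, 2


@dataclass
class Region:
    bottom: int   # RegionNBottom value (256-bit words, inclusive)
    top: int      # RegionNTop value (256-bit words, inclusive)
    control: int  # 6-bit RegionNControl value


def access_allowed(regions, word_addr: int, access: int, user_mode: bool) -> bool:
    """Check an access against every region that contains word_addr."""
    # User permissions occupy bits 5:3, supervisor permissions bits 2:0.
    bit = access + (3 if user_mode else 0)
    # Zero-sized regions (top < bottom) can never match.
    hits = [r for r in regions if r.bottom <= word_addr <= r.top]
    if not hits:
        return False  # assumption: an address in no region is rejected
    # Overlapping regions must all permit the access.
    return all((r.control >> bit) & 1 for r in hits)
```

With the reset value 0x07 in a control register, supervisor code has full access to the region and user code has none.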



11.6.5.3 Status Register 

The SPARC V8 architecture allows for a number of types of memory access error to be trapped. These trap
types and trap handling in general are described in chapter 7 of the SPARC architecture manual [32].
According to the SPARC architecture manual the processor will automatically move to the next register
window (i.e. it decrements the current window pointer) and copies the program counters (PC and nPC) to
two local registers in the new window. The supervisor bit in the PSR is also set and the PSR can be saved
to another local register by the trap handler (this does not happen automatically in hardware).
At the time of writing it is not clear whether the LEON core can easily accept memory access error trap
types (i.e. the 8-bit tt field of the Trap Base register). Further investigation is needed to determine if this is
possible and if existing trap types will cover the different types of bus error possible on SoPEC. Up to 32
implementation specific trap types are allowed so conditions unique to SoPEC can be handled in this
manner.

If it is not possible for sufficient information about the cause of the bus error to be passed to the LEON
core using the above mechanisms then a status register will be implemented to record the relevant
information.

11.6.6 MMU Sub-block partition 

As can be seen from Figure 19 and Figure 20 the MMU consists of five principal sub-blocks. For clarity
the connections between these sub-blocks and other SoPEC blocks and between each of the sub-blocks are
shown in two separate diagrams.

[Figure 19: block diagram. Legible signals include the AHB signals hwdata[31:0], hrdata[31:0], hwrite,
htrans[1:0], hsize[2:0], hburst[2:0], hprot[3:0], hmaster[3:0], hmastlock, hresp[1:0] and hsplit[15:0];
the DIU signals dram_cpu_data[255:0], diu_cpu_rack, diu_cpu_rvalid, cpu_diu_wreq, diu_cpu_wack,
cpu_diu_wvalid and cpu_diu_wmask[1:0]; the interrupt signals icu_cpu_ilevel[3:0], cpu_iack and
cpu_icu_ilevel[3:0]; the CPU bus signals cpu_rwn, cpu_dataout[31:0], cpu_adr[21:0] and cpu_acode[1:0];
the block select signals cpu_cpr_sel, cpu_diu_sel, cpu_gpio_sel, cpu_icu_sel, cpu_lss_sel, cpu_pcu_sel,
cpu_scb_sel and cpu_rom_sel; and the block read data buses cpr_cpu_data[31:0], diu_cpu_data[31:0],
gpio_cpu_data[31:0], icu_cpu_data[31:0], lss_cpu_data[31:0], pcu_cpu_data[31:0], scb_cpu_data[31:0],
rom_cpu_data[31:0] and pss_cpu_data[31:0].]

Figure 19. MMU Sub-block partition, external signal view

Figure 20. MMU Sub-block partition, internal signal view

11.6.6.1 LEON Bridge

At the time of writing it is expected that the LEON core will be used with its AHB interface rather than be
modified to comply with the protocols used on SoPEC, in particular the DIU protocol for DRAM access.
The LEON bridge consists of an AHB bridge and some glue logic. The AHB bridge will convert between
the AHB and the DIU and CPU subsystem bus protocols. The AHB bridge will always be a slave on the
AHB. Glue logic will be required to assist with endianness coherency, interrupts and other miscellaneous
signalling.



Table 20. LEON bridge I/Os

Port name            Pins  I/O  Description

Global SoPEC signals
prst_n               1     In   Global reset. Synchronous to pclk, active low.
pclk                 1     In   Global clock

LEON Bridge to AHB signals
haddr[31:0]          32    In   AHB address bus
hwdata[31:0]         32    In   AHB write data bus
hrdata[31:0]         32    Out  AHB read data bus
hsel                 1     In   AHB slave select signal
hwrite               1     In   AHB write signal:
                                1 - Write access
                                0 - Read access
htrans               2     In   Indicates the type of the current transfer:
                                00 - IDLE
                                01 - BUSY
                                10 - NONSEQ
                                11 - SEQ
hsize                3     In   Indicates the size of the current transfer:
                                000 - Byte transfer
                                001 - Halfword transfer
                                010 - Word transfer
                                011 - 64-bit transfer (unsupported?)
                                1xx - Unsupported (larger wordsizes)
hburst               3     In   Indicates if the current transfer forms part of a burst and the type of burst:
                                000 - SINGLE
                                001 - INCR
                                010 - WRAP4
                                011 - INCR4
                                100 - WRAP8
                                101 - INCR8
                                110 - WRAP16
                                111 - INCR16
hprot                4     In   Protection control signals pertaining to the current access:
                                hprot[0] - Opcode(0) / Data(1) access
                                hprot[1] - User(0) / Supervisor access
                                hprot[2] - Non-bufferable(0) / Bufferable(1) access (unsupported)
                                hprot[3] - Non-cacheable(0) / Cacheable access
hmaster              4     In   Indicates the identity of the current bus master. This will always be the
                                LEON core.
hmastlock            1     In   Indicates that the current master is performing a locked sequence of
                                transfers.
hready               1     Out  Active high ready signal indicating the access has completed
hresp                2     Out  Indicates the status of the transfer:
                                00 - OKAY
                                01 - ERROR
                                10 - RETRY
                                11 - SPLIT
hsplit               16    Out  This 16-bit split bus is used by a slave to indicate to the arbiter which bus
                                masters should be allowed attempt a split transaction. This feature will be
                                unsupported on the AHB bridge

Toplevel/Common LEON bridge signals
cpu_dataout[31:0]    32    Out  Data out bus to both DRAM and peripheral devices.
cpu_rwn              1     Out  Read/NotWrite signal. 1 = Current access is a read access. 0 = Current
                                access is a write access
icu_cpu_ilevel[3:0]  4     In   An interrupt is asserted by driving the appropriate priority level on
                                icu_cpu_ilevel. These signals must remain asserted until the CPU executes
                                an interrupt acknowledge cycle.
cpu_icu_ilevel[3:0]  4     In   Indicates the level of the interrupt the CPU is acknowledging when
                                cpu_iack is high
cpu_iack             1     Out  Interrupt acknowledge signal. The exact timing depends on the CPU core
                                implementation
cpu_start_access     1     Out  Start Access signal indicating the start of a data transfer and that the
                                cpu_adr, cpu_dataout, cpu_rwn and cpu_acode signals are all valid. This
                                signal is only asserted during the first cycle of an access
cpu_ben[1:0]         2     Out  Byte enable signals.

LEON core to LEON bridge signals
iui.irl              4     Out  Interrupt level request to the LEON Integer Unit
iuo.irl              4     In   Acknowledged interrupt level from the LEON Integer Unit
iuo.intack           1     In   Interrupt acknowledge signal from the LEON Integer Unit

LEON bridge to MMU Control Block signals
cpu_mmu_adr          32    Out  CPU Address Bus.
mmu_cpu_data         32    In   Data bus from the MMU
mmu_cpu_rdy          1     In   Ready signal from the MMU
cpu_mmu_acode        2     Out  Access code signals to the MMU
mmu_cpu_berr         1     In   Bus error signal from the MMU


Description:

The LEON bridge must ensure that all CPU bus and interrupt transactions are functionally correct and that
the timing requirements are met. It is also responsible for ensuring endianness coherency i.e.
guaranteeing that the correct data appears in the correct position on the data buses (hrdata, cpu_dataout
and mmu_cpu_data) for every type of access. This is a requirement because the LEON uses big-endian
addressing while the rest of SoPEC is little-endian.
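
As a rough illustration of the endianness issue, the sketch below reverses the byte order of a full 32-bit word. It is only an assumption about what the simplest case might look like: the function name is hypothetical, and the real bridge would also have to steer byte and halfword accesses onto the correct lanes, which is not modelled here.

```python
import struct


def swap_word_endianness(word: int) -> int:
    """Reverse the byte order of a 32-bit word (big-endian <-> little-endian)."""
    # Pack as big-endian bytes, then reinterpret the same bytes as little-endian.
    return struct.unpack("<I", struct.pack(">I", word & 0xFFFF_FFFF))[0]
```

Applying the function twice returns the original word, so the same operation serves for both directions of the conversion.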

It is expected that some signals (especially those external to the CPU block) will need to be registered here
to meet the timing requirements. Careful thought will be required to ensure that overall CPU access times
are not excessively degraded by the use of too many register stages.

11.6.6.2 DIU Bus Interface 

The DIU bus interface will handle all valid accesses to the embedded DRAM via the DIU. The DIU bus 
Sd tL^i^enl^ ""^''"^ *° ^ ^ "^""^ ^ arbitration 

Table 21. DIU Bus Interface I/Os

Port name               Pins  I/O  Description

Global SoPEC signals
prst_n                  1     In   Global reset. Synchronous to pclk, active low.
pclk                    1     In   Global clock.

Toplevel/Common DIU Bus Interface signals
dram_cpu_data[255:0]    256   In   Read data from the DRAM.
cpu_diu_rreq            1     Out  Read request to the DIU DRAM.
diu_cpu_rack            1     In   Acknowledge from the DIU that the read request has been accepted.
diu_cpu_rvalid          1     In   Signal from the DIU indicating that valid read data is on the
                                   dram_cpu_data bus.
cpu_diu_wreq            1     Out  Write request to the DIU.
diu_cpu_wack            1     In   Acknowledge from the DIU that the write request has been accepted.
cpu_diu_wvalid          1     Out  Signal from the CPU to the DIU indicating that the data currently on
                                   the cpu_dataout bus is valid.
cpu_diu_wmask[1:0]      2     Out  Flag indicating the format of a CPU write to DRAM. These signals are
                                   directly derived from the cpu_ben signals.
                                   cpu_diu_wmask = 00: 8-bit write
                                   cpu_diu_wmask = 01: 16-bit write
                                   cpu_diu_wmask = 10: 32-bit write
                                   cpu_diu_wmask = 11: reserved
                                   cpu_adr[2:0] are driven in accordance with the width of the data
                                   access indicated by cpu_diu_wmask. Addresses cannot cross a
                                   256-bit word DRAM boundary.
dram_rdy                1     Out  Data Ready signal. Indicates the data on the dram_cpu_data bus is
                                   valid for a read cycle or that the data was successfully dispatched
                                   to the DIU for a write cycle.

DIU Bus Interface to MMU Control Block signals
cpu_adr[21:0]           22    In   Toplevel CPU Address bus.
dram_data[31:0]         32    Out  Data bus containing the 32 bits addressed by cpu_adr[4:2] from the
                                   256-bit DRAM read bus dram_cpu_data.
dram_access_en          1     In   Enable Access signal. A DRAM access cannot be initiated unless it
                                   has been enabled by the MMU Control Unit.

DIU Bus Interface to ICache signals
ic_cache_hit            1     In   Cache hit signal from the ICache. This indicates that the current
                                   CPU read request is being serviced by the ICache and so should
                                   not be retrieved from the DRAM.

DIU Bus Interface to LEON bridge signals
cpu_ben[1:0]            2     In   Byte enable signals from the LEON bridge. These are forwarded on
                                   to the DIU as the cpu_diu_wmask signals.
cpu_start_access        1     In   Start Access signal from the LEON bridge indicating the start of a
                                   data transfer and that the cpu_adr, cpu_dataout, cpu_rwn and
                                   cpu_acode signals are all valid. This signal is only asserted during
                                   the first cycle of an access.
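
The cpu_diu_wmask encoding listed in Table 21 can be captured in a few lines of Python. This is only an
illustrative sketch; write_width_bits is a hypothetical helper name and rejecting the reserved code with
an exception is an assumption, not something the table specifies.

```python
# Encoding of cpu_diu_wmask[1:0] as listed in Table 21.
WMASK_WIDTH_BITS = {0b00: 8, 0b01: 16, 0b10: 32}


def write_width_bits(cpu_diu_wmask: int) -> int:
    """Return the CPU write width in bits for a 2-bit cpu_diu_wmask value."""
    mask = cpu_diu_wmask & 0b11
    if mask == 0b11:
        # Table 21 marks this code as reserved; raising is this sketch's choice.
        raise ValueError("cpu_diu_wmask = 11 is reserved")
    return WMASK_WIDTH_BITS[mask]
```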



Description: 

The DIU Bus Interface handles all data transfers between the CPU (or ICache) and the DIU. This involves
translating between the different protocols used on the DIU and CPU buses. The validity of an access (i.e.
whether the CPU is running in the correct mode for the address space being accessed) is determined by the
MMU Control Block, which also checks that a DRAM access does not cross a 256-bit boundary (as required by
the DIU) and asserts dram_access_en if the access is valid. Invalid accesses do not initiate DRAM
accesses. The operation of the DIU Bus Interface is described by the state machine shown in Figure 21 and
the DIU bus protocol is described in more detail in section 20.9. The DIU will return a 256-bit dataword
on dram_cpu_data[255:0] for every read access. The DIU Bus Interface must select the appropriate 32-bit
word from this according to the word address given by cpu_adr[4:2].

[State diagram: on reset (prst_n == 0) the outputs are cpu_diu_wreq = 0, cpu_diu_wvalid = 0 and dram_rdy = 0.]

Figure 21. DIU Bus Interface state machine
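
The word selection performed on read data can be sketched in Python as follows. The sketch assumes that
word 0 occupies the least significant 32 bits of dram_cpu_data, which is consistent with SoPEC being
little-endian but is not stated explicitly here; select_word is a hypothetical helper name.

```python
def select_word(dram_cpu_data: int, cpu_adr: int) -> int:
    """Pick the 32-bit word addressed by cpu_adr[4:2] out of a 256-bit DRAM word.

    Assumption: word index 0 is held in bits 31:0 of dram_cpu_data.
    """
    word_index = (cpu_adr >> 2) & 0x7          # cpu_adr[4:2]
    return (dram_cpu_data >> (32 * word_index)) & 0xFFFFFFFF
```

For example, with a 256-bit word whose eight 32-bit fields hold the values 1 to 8 in ascending order, a
byte address of 0x14 selects field 5, i.e. the value 6.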



11.6.6.3 CPU Subsystem Bus Interface

The CPU Subsystem Interface block handles all valid accesses to the peripheral blocks that comprise the 
CPU Subsystem. 



Table 22. CPU Subsystem Bus Interface I/Os

Port name               Pins  I/O  Description

Global SoPEC signals
prst_n                  1     In   Global reset. Synchronous to pclk, active low.
pclk                    1     In   Global clock.

Toplevel/Common CPU Subsystem Bus Interface signals
cpu_cpr_sel             1     Out  CPR block select.
cpu_gpio_sel            1     Out  GPIO block select.
cpu_icu_sel             1     Out  ICU block select.
cpu_lss_sel             1     Out  LSS block select.
cpu_pcu_sel             1     Out  PCU block select.
cpu_scb_sel             1     Out  SCB block select.
cpu_tim_sel             1     Out  Timers block select.
cpu_rom_sel             1     Out  ROM block select.
cpu_pss_sel             1     Out  PSS block select.
cpu_diu_sel             1     Out  DIU block select.
cpr_cpu_data[31:0]      32    In   Read data bus from the CPR block.
gpio_cpu_data[31:0]     32    In   Read data bus from the GPIO block.
icu_cpu_data[31:0]      32    In   Read data bus from the ICU block.
lss_cpu_data[31:0]      32    In   Read data bus from the LSS block.
pcu_cpu_data[31:0]      32    In   Read data bus from the PCU block.
scb_cpu_data[31:0]      32    In   Read data bus from the SCB block.
tim_cpu_data[31:0]      32    In   Read data bus from the Timers block.
rom_cpu_data[31:0]      32    In   Read data bus from the ROM block.
pss_cpu_data[31:0]      32    In   Read data bus from the PSS block.
diu_cpu_data[31:0]      32    In   Read data bus from the DIU block.
cpr_cpu_rdy             1     In   Ready signal to the CPU. When cpr_cpu_rdy is high it indicates the
                                   last cycle of the access. For a write cycle this means cpu_dataout
                                   has been registered by the CPR block and for a read cycle this
                                   means the data on cpr_cpu_data is valid.
gpio_cpu_rdy            1     In   GPIO ready signal to the CPU.
icu_cpu_rdy             1     In   ICU ready signal to the CPU.
lss_cpu_rdy             1     In   LSS ready signal to the CPU.
pcu_cpu_rdy             1     In   PCU ready signal to the CPU.
scb_cpu_rdy             1     In   SCB ready signal to the CPU.
tim_cpu_rdy             1     In   Timers block ready signal to the CPU.
rom_cpu_rdy             1     In   ROM block ready signal to the CPU.
pss_cpu_rdy             1     In   PSS block ready signal to the CPU.
diu_cpu_rdy             1     In   DIU register block ready signal to the CPU.
cpr_cpu_berr            1     In   Bus Error signal from the CPR block.
gpio_cpu_berr           1     In   Bus Error signal from the GPIO block.
icu_cpu_berr            1     In   Bus Error signal from the ICU block.
lss_cpu_berr            1     In   Bus Error signal from the LSS block.
pcu_cpu_berr            1     In   Bus Error signal from the PCU block.
scb_cpu_berr            1     In   Bus Error signal from the SCB block.
tim_cpu_berr            1     In   Bus Error signal from the Timers block.
rom_cpu_berr            1     In   Bus Error signal from the ROM block.
pss_cpu_berr            1     In   Bus Error signal from the PSS block.
diu_cpu_berr            1     In   Bus Error signal from the DIU block.

CPU Subsystem Bus Interface to MMU Control Block signals
cpu_adr[19:12]          8     In   Toplevel CPU Address bus. Only bits 19-12 are required to decode
                                   the peripherals address space.
peri_access_en          1     In   Enable Access signal. A peripheral access cannot be initiated
                                   unless it has been enabled by the MMU Control Unit.
peri_mmu_data[31:0]     32    Out  Data bus from the selected peripheral.
peri_mmu_rdy            1     Out  Data Ready signal. Indicates the data on the peri_mmu_data bus is
                                   valid for a read cycle or that the data was successfully written to the
                                   peripheral for a write cycle.
peri_mmu_berr           1     Out  Bus Error signal. Indicates a bus error has occurred in accessing
                                   the selected peripheral.

CPU Subsystem Bus Interface to LEON bridge signals
cpu_start_access        1     In   Start Access signal from the LEON bridge indicating the start of a
                                   data transfer and that the cpu_adr, cpu_dataout, cpu_rwn and
                                   cpu_acode signals are all valid. This signal is only asserted during
                                   the first cycle of an access.



// The peri_access_en signal will have the 
// timing required for block selects 



Description: 

The CPU Subsystem Bus Interface block performs simple address decoding to select a peripheral and
multiplexing of the returned signals from the various peripheral blocks. The base addresses used for the
decode operation are defined in the SoPEC address map. Accesses to the MMU configuration registers are handled
by the MMU Control Block rather than the CPU Subsystem Bus Interface block. The CPU Subsystem Bus
Interface block operation is described by the following pseudocode:

masked_cpu_adr = cpu_adr[19:12]
case (masked_cpu_adr)
    when TIM_base[19:12]
        cpu_tim_sel = peri_access_en
        peri_mmu_data = tim_cpu_data
        peri_mmu_rdy = tim_cpu_rdy
        peri_mmu_berr = tim_cpu_berr
        all_other_selects = 0    // Shorthand to ensure other cpu_block_sel signals
                                 // remain deasserted
    when LSS_base[19:12]
        cpu_lss_sel = peri_access_en
        peri_mmu_data = lss_cpu_data
        peri_mmu_rdy = lss_cpu_rdy
        peri_mmu_berr = lss_cpu_berr
        all_other_selects = 0
    when GPIO_base[19:12]
        cpu_gpio_sel = peri_access_en
        peri_mmu_data = gpio_cpu_data
        peri_mmu_rdy = gpio_cpu_rdy
        peri_mmu_berr = gpio_cpu_berr
        all_other_selects = 0
    when SCB_base[19:12]
        cpu_scb_sel = peri_access_en
        peri_mmu_data = scb_cpu_data
        peri_mmu_rdy = scb_cpu_rdy
        peri_mmu_berr = scb_cpu_berr
        all_other_selects = 0
    when ICU_base[19:12]
        cpu_icu_sel = peri_access_en
        peri_mmu_data = icu_cpu_data
        peri_mmu_rdy = icu_cpu_rdy
        peri_mmu_berr = icu_cpu_berr
        all_other_selects = 0
    when CPR_base[19:12]
        cpu_cpr_sel = peri_access_en
        peri_mmu_data = cpr_cpu_data
        peri_mmu_rdy = cpr_cpu_rdy
        peri_mmu_berr = cpr_cpu_berr
        all_other_selects = 0
    when ROM_base[19:12]
        cpu_rom_sel = peri_access_en
        peri_mmu_data = rom_cpu_data
        peri_mmu_rdy = rom_cpu_rdy
        peri_mmu_berr = rom_cpu_berr
        all_other_selects = 0
    when PSS_base[19:12]
        cpu_pss_sel = peri_access_en
        peri_mmu_data = pss_cpu_data
        peri_mmu_rdy = pss_cpu_rdy
        peri_mmu_berr = pss_cpu_berr
        all_other_selects = 0
    when DIU_base[19:12]
        cpu_diu_sel = peri_access_en
        peri_mmu_data = diu_cpu_data
        peri_mmu_rdy = diu_cpu_rdy
        peri_mmu_berr = diu_cpu_berr
        all_other_selects = 0
    when PCU_base[19:12]
        cpu_pcu_sel = peri_access_en    // the PCU case drives the PCU select
        peri_mmu_data = pcu_cpu_data
        peri_mmu_rdy = pcu_cpu_rdy
        peri_mmu_berr = pcu_cpu_berr
        all_other_selects = 0
    when others
        all_block_selects = 0
        peri_mmu_data = 0x00000000
        peri_mmu_rdy = 0
        peri_mmu_berr = 1
end case
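
The decode-and-multiplex behaviour in the pseudocode above can also be expressed as a small executable
Python model. This is a sketch only: the BASE_PAGE values are placeholders invented for illustration (the
real base addresses come from the SoPEC address map), and Returned and decode are hypothetical names.

```python
from typing import Dict, NamedTuple, Tuple


class Returned(NamedTuple):
    """Signals returned by one peripheral block (block_cpu_data/rdy/berr)."""
    data: int
    rdy: int
    berr: int


# Placeholder values for block_base[19:12]; NOT the real SoPEC address map.
BASE_PAGE = {"tim": 0x80, "lss": 0x81, "gpio": 0x82, "scb": 0x83, "icu": 0x84,
             "cpr": 0x85, "rom": 0x86, "pss": 0x87, "diu": 0x88, "pcu": 0x89}


def decode(cpu_adr: int, peri_access_en: int,
           returned: Dict[str, Returned]) -> Tuple[Dict[str, int], int, int, int]:
    """Return (block selects, peri_mmu_data, peri_mmu_rdy, peri_mmu_berr)."""
    page = (cpu_adr >> 12) & 0xFF               # cpu_adr[19:12]
    selects = {name: 0 for name in BASE_PAGE}   # all selects deasserted by default
    for name, base in BASE_PAGE.items():
        if page == base:
            selects[name] = peri_access_en      # select gated by peri_access_en
            r = returned[name]
            return selects, r.data, r.rdy, r.berr
    # "when others": no block matches, so signal a bus error.
    return selects, 0x00000000, 0, 1
```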



11.6.6.4 MMU Control Block



The MMU Control Block determines whether every CPU access is a valid access. No more than one cycle
is to be consumed in determining the validity of an access and all accesses must terminate with the
assertion of either mmu_cpu_rdy or mmu_cpu_berr. To safeguard against stalling the CPU a simple bus timeout
mechanism will be supported.



Table 23. MMU Control Block I/Os

Port name               Pins  I/O  Description

Global SoPEC signals
prst_n                  1     In   Global reset. Synchronous to pclk, active low.
pclk                    1     In   Global clock.

Toplevel/Common MMU Control Block signals
cpu_adr[21:0]           22    Out  Address bus for both DRAM and peripheral access.
cpu_acode[1:0]          2     Out  CPU access code signals (cpu_mmu_acode) retimed to meet the
                                   CPU Subsystem Bus timing requirements.
dram_access_en          1     Out  DRAM Access Enable signal. Indicates that the current CPU
                                   access is a valid DRAM access.

MMU Control Block to LEON bridge signals
cpu_mmu_adr[31:0]       32    In   CPU core address bus.
cpu_dataout[31:0]       32    In   Toplevel CPU data bus.
mmu_cpu_data[31:0]      32    Out  Data bus to the CPU core. Carries the data for all CPU read
                                   operations.
cpu_rwn                 1     In   Toplevel CPU Read/notWrite signal.
cpu_mmu_acode[1:0]      2     In   CPU access code signals.
mmu_cpu_rdy             1     Out  Ready signal to the CPU core. Indicates the completion of all valid
                                   CPU accesses.
mmu_cpu_berr            1     Out  Bus Error signal to the CPU core. This signal is asserted to
                                   terminate an invalid access.
cpu_start_access        1     In   Start Access signal from the LEON bridge indicating the start of a
                                   data transfer and that the cpu_adr, cpu_dataout, cpu_rwn and
                                   cpu_acode signals are all valid. This signal is only asserted during
                                   the first cycle of an access.
cpu_iack                1     In   Interrupt Acknowledge signal from the CPU. This signal is only
                                   asserted during an interrupt acknowledge cycle.
cpu_ben[1:0]            2     In   Byte enable signals indicating which bytes of the 32-bit bus are
                                   being accessed.

MMU Control Block to DIU Bus Interface signals
dram_rdy                1     In   Data Ready signal. Indicates the data on the dram_cpu_data bus is
                                   valid for a read cycle or that the data was successfully dispatched
                                   to the DIU for a write cycle.

MMU Control Block to ICache signals
ic_data[31:0]           32    In   Data bus from the ICache.
ic_rdy                  1     In   Ready signal from the ICache indicating the data on ic_data is valid.

MMU Control Block to CPU Subsystem Bus Interface signals
peri_access_en          1     Out  Enable Access signal. A peripheral access cannot be initiated
                                   unless it has been enabled by the MMU Control Unit.
peri_mmu_data[31:0]     32    In   Data bus from the selected peripheral.
peri_mmu_rdy            1     In   Data Ready signal. Indicates the data on the peri_mmu_data bus is
                                   valid for a read cycle or that the data was successfully written to the
                                   peripheral for a write cycle.
peri_mmu_berr           1     In   Bus Error signal. Indicates a bus error has occurred in accessing
                                   the selected peripheral.



Description: 



The MMU Control Block is responsible for the MMU's core functionality, namely determining whether or
not an access to any part of the address map is valid. An access is considered valid if it is to a mapped area
of the address space and if the CPU is running in the appropriate mode for that address space. Furthermore
the MMU control block must correctly handle the special cases that are: an interrupt acknowledge cycle, a
reset exception vector fetch, an access that crosses a 256-bit DRAM word boundary and a bus timeout
condition. The following pseudocode shows the logic required to implement the MMU Control Block
functionality. It does not deal with the timing relationships of the various signals - it is the designer's
responsibility to ensure that these relationships are correct and comply with the different bus protocols.
For simplicity the pseudocode is split up into numbered sections so that the functionality may be seen
more easily.

PS0 Description: This first segment of code defines a number of constants and variables that are used
elsewhere in this description. Most signals have been defined in the I/O descriptions of the MMU
sub-blocks that precede this section of the document. The post_reset_state variable is used later (in section
PS4) to determine if we should translate the reset exception vector address or trap a null pointer access.

PS0:

const UnusedBottom = 0x002AC000
const DRAMTop = 0x0027FFFF
const UserDataSpace = b01
const UserProgramSpace = b00
const SupervisorDataSpace = b11
const SupervisorProgramSpace = b10

const timeout_limit = 0x40 // Need to confirm that this is a suitable value
const ResetExceptionCycles = 0x8

cpu_adr_peri_masked[7:0] = cpu_mmu_adr[19:12]
cpu_adr_dram_masked[16:0] = cpu_mmu_adr & 0x003FFFE0

if (prst_n == 0) then // initialise everything
    cpu_adr = cpu_mmu_adr[21:0]
    peri_access_en = 0
    dram_access_en = 0
    mmu_cpu_data = peri_mmu_data
    mmu_cpu_rdy = 0
    mmu_cpu_berr = 0
    post_reset_state = TRUE
    access_initiated = FALSE
    cpu_access_cnt = 0

// The following is used to determine if we are coming out of reset for the purposes of
// reset exception vector redirection. There may be a convenient signal in the CPU core
// that we could use instead of this.

if ((cpu_start_access == 1) AND (cpu_access_cnt < ResetExceptionCycles) AND
    (clock_tick == TRUE)) then
    cpu_access_cnt = cpu_access_cnt + 1
else
    post_reset_state = FALSE

PS1 Description: This section is at the top of the hierarchy that determines the validity of an access. The
address is tested to see which macro-region (i.e. Unused, CPU Subsystem or DRAM) it falls into or
whether the reset exception vector is being accessed.

PS1:

if (cpu_mmu_adr >= UnusedBottom) then
    // The access is to an invalid area of the address space. See section PS2

elsif ((cpu_mmu_adr > DRAMTop) AND (cpu_mmu_adr < UnusedBottom)) then
    // We are in the CPU Subsystem/PEP Subsystem address space. See section PS3

// Only remaining possibility is an access to DRAM address space
// First we need to intercept the special case for the reset exception vector

elsif (cpu_mmu_adr < 0x00000010) then
    // The reset exception is being accessed. See section PS4

elsif ((cpu_adr_dram_masked >= Region0Bottom) AND (cpu_adr_dram_masked <= Region0Top)) then
    // We are in Region0. See section PS5

elsif ((cpu_adr_dram_masked >= RegionNBottom) AND (cpu_adr_dram_masked <= RegionNTop)) then
    // We are in RegionN
    // Repeat the Region0 (i.e. section PS5) logic for each of Region1 to Region7

else // We could end up here if there were gaps in the DRAM regions
    peri_access_en = 0
    dram_access_en = 0
    mmu_cpu_berr = 1    // we have an unknown access error, most likely due to hitting
    mmu_cpu_rdy = 0     // a gap in the DRAM regions

// Only thing remaining is to implement a bus timeout function. This is done in PS6
end
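
The macro-region test at the top of PS1 can be checked with a short Python sketch. The constants are the
PS0 values as read from this document; classify and the returned region labels are hypothetical names, and
the per-region (Region0 to RegionN) bounds checks are deliberately left out.

```python
UNUSED_BOTTOM = 0x002AC000       # UnusedBottom from PS0
DRAM_TOP = 0x0027FFFF            # DRAMTop from PS0
RESET_VECTOR_LIMIT = 0x00000010  # addresses below this hit the reset exception vector


def classify(cpu_mmu_adr: int) -> str:
    """Return the macro-region an address falls into, in PS1 priority order."""
    if cpu_mmu_adr >= UNUSED_BOTTOM:
        return "unused"          # handled by PS2
    if cpu_mmu_adr > DRAM_TOP:
        return "cpu_subsystem"   # handled by PS3
    if cpu_mmu_adr < RESET_VECTOR_LIMIT:
        return "reset_vector"    # handled by PS4
    return "dram"                # handled by PS5 for each DRAM region
```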

PS2 Description: Accesses to the large unused area of the address space are trapped by this section. No
bus transactions are initiated and the mmu_cpu_berr signal is asserted.

PS2:

elsif (cpu_mmu_adr >= UnusedBottom) then
    peri_access_en = 0    // The access is to an invalid area of the address space
    dram_access_en = 0
    mmu_cpu_berr = 1
    mmu_cpu_rdy = 0

PS3 Description: This section deals with accesses to CPU Subsystem peripherals, including the MMU
itself. If the MMU registers are being accessed then no external bus transactions are required. Access to
the MMU registers is only permitted if the CPU is making a data access from supervisor mode, otherwise
a bus error is asserted and the access terminated. For non-MMU accesses, transactions occur over the
CPU Subsystem Bus and each peripheral is responsible for determining whether or not the CPU is in the
correct mode (based on the cpu_acode signals) to be permitted access to its registers. Note that all of the
PEP registers are accessed via the PCU which is on the CPU Subsystem Bus.

PS3:

elsif ((cpu_mmu_adr > DRAMTop) AND (cpu_mmu_adr < UnusedBottom)) then
    // We are in the CPU Subsystem/PEP Subsystem address space
    cpu_adr = cpu_mmu_adr[21:0]
    if (cpu_adr_peri_masked == MMU_base) then // access is to local registers
        peri_access_en = 0
        dram_access_en = 0
        if (cpu_acode == SupervisorDataSpace) then
            for (i=0; i<26; i++) {
                if (i == cpu_mmu_adr[6:2]) then // selects the addressed register
                    if (cpu_rwn == 1) then
                        mmu_cpu_data[16:0] = MMUReg[i]    // MMUReg[i] is one of the
                        mmu_cpu_rdy = 1                   // registers in Table 18
                        mmu_cpu_berr = 0
                    else // write cycle
                        MMUReg[i] = cpu_dataout[16:0]
                        mmu_cpu_rdy = 1
                        mmu_cpu_berr = 0
                else // there is no register mapped to this address
                    mmu_cpu_berr = 1    // do we really want a bus_error here as registers
                    mmu_cpu_rdy = 0     // are just mirrored in other blocks
            }
        else // we have an access violation
            mmu_cpu_berr = 1
            mmu_cpu_rdy = 0
    else // access is to something else on the CPU Subsystem Bus
        peri_access_en = 1
        dram_access_en = 0
        mmu_cpu_data = peri_mmu_data
        mmu_cpu_rdy = peri_mmu_rdy
        mmu_cpu_berr = peri_mmu_berr

PS4 Description: The only correct accesses to the locations beneath 0x00000010 are fetches of the reset
trap handling routine and these should be the first accesses after reset. Here we trap all other accesses to
these locations regardless of the CPU mode. The most likely cause of such an access will be the use of a
null pointer in the program executing on the CPU.

PS4:

elsif (cpu_mmu_adr < 0x00000010) then    // may need to translate a wider range - depends
    if (post_reset_state == TRUE) then   // on how LEON handles the reset exception.
        cpu_adr[21:0] = {ROM_base[21:3], cpu_mmu_adr[2:0]}
        peri_access_en = 1
        dram_access_en = 0
        mmu_cpu_data = peri_mmu_data
        mmu_cpu_rdy = peri_mmu_rdy
        mmu_cpu_berr = peri_mmu_berr
    else // we have a problem (almost certainly a null pointer)
        peri_access_en = 0
        dram_access_en = 0
        mmu_cpu_berr = 1
        mmu_cpu_rdy = 0

PS5 Description: This large section of pseudocode simply checks whether the access is within the bounds
of DRAM Region0 and if so whether or not the access is of a type permitted by the Region0Control
register. If the access is permitted then a DRAM access is initiated for all data accesses and for instruction
fetches that result in a cache miss. All instruction fetches are returned via the ICache interface regardless
of whether they come from a cache hit or refill from DRAM. If the access is not of a type permitted by the
Region0Control register then the access is terminated with a bus error.

PS5:

elsif ((cpu_adr_dram_masked >= Region0Bottom) AND (cpu_adr_dram_masked <= Region0Top)) then
    // we are in Region0
    // We need to check that the DRAM access does not cross a 256-bit boundary
    // Only 16 or 32-bit CPU accesses are capable of traversing a 256-bit boundary
    if (((cpu_mmu_adr[4:0] == 0x1F) AND ((cpu_ben == b01) OR (cpu_ben == b10)))
        OR ((cpu_mmu_adr[4:0] == 0x1E) AND (cpu_ben == b10))
        OR ((cpu_mmu_adr[4:0] == 0x1D) AND (cpu_ben == b10))) then
        peri_access_en = 0
        dram_access_en = 0
        mmu_cpu_berr = 1
        mmu_cpu_rdy = 0
    else // access does not cross 256-bit boundary so we can proceed
        cpu_adr = cpu_mmu_adr[21:0]
        if (cpu_rwn == 1) then
            if ((cpu_acode == SupervisorProgramSpace AND Region0Control[2] == 1)
                OR (cpu_acode == UserProgramSpace AND Region0Control[5] == 1)) then
                // this is a valid instruction fetch from Region0
                peri_access_en = 0
                dram_access_en = 1
                mmu_cpu_data = ic_data
                mmu_cpu_rdy = ic_rdy
                mmu_cpu_berr = 0
            elsif ((cpu_acode == SupervisorDataSpace AND Region0Control[0] == 1)
                OR (cpu_acode == UserDataSpace AND Region0Control[3] == 1)) then
                // this is a valid read access from Region0
                peri_access_en = 0
                dram_access_en = 1
                mmu_cpu_data = dram_data    // possibly drc_data if dcache is used
                mmu_cpu_rdy = dram_rdy      // possibly drc_rdy
                mmu_cpu_berr = 0
            else // we have an access violation
                peri_access_en = 0
                dram_access_en = 0
                mmu_cpu_berr = 1
                mmu_cpu_rdy = 0
        else // it is a write access
            if ((cpu_acode == SupervisorDataSpace AND Region0Control[1] == 1)
                OR (cpu_acode == UserDataSpace AND Region0Control[4] == 1)) then
                // this is a valid write access to Region0
                peri_access_en = 0
                dram_access_en = 1
                mmu_cpu_rdy = dram_rdy      // possibly dwc_rdy if dcache is used
                mmu_cpu_berr = 0
            else // we have an access violation
                peri_access_en = 0
                dram_access_en = 0
                mmu_cpu_berr = 1
                mmu_cpu_rdy = 0
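
The three-way boundary test at the top of PS5 is equivalent to a single arithmetic comparison, which the
Python sketch below demonstrates. It assumes that cpu_ben uses the same encoding as cpu_diu_wmask (00 for
8-bit, 01 for 16-bit, 10 for 32-bit accesses), since Table 21 states that the latter is derived directly
from the former; both function names are hypothetical.

```python
# Assumed cpu_ben encoding, taken from the cpu_diu_wmask description: bytes per access.
ACCESS_BYTES = {0b00: 1, 0b01: 2, 0b10: 4}


def crosses_256bit_boundary(cpu_mmu_adr: int, cpu_ben: int) -> bool:
    """True if the access runs past the end of its 32-byte (256-bit) DRAM word."""
    offset = cpu_mmu_adr & 0x1F                 # cpu_mmu_adr[4:0]
    return offset + ACCESS_BYTES[cpu_ben] > 32


def crosses_per_pseudocode(cpu_mmu_adr: int, cpu_ben: int) -> bool:
    """Literal transcription of the PS5 condition, for comparison."""
    offset = cpu_mmu_adr & 0x1F
    return ((offset == 0x1F and cpu_ben in (0b01, 0b10))
            or (offset == 0x1E and cpu_ben == 0b10)
            or (offset == 0x1D and cpu_ben == 0b10))
```

The two functions agree for every offset and every non-reserved cpu_ben value, so either form could be
used as a reference when verifying the RTL.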



PS6 Description: This final section of pseudocode deals with the special case of a bus timeout. This
occurs when an access has been initiated but has not completed before the timeout_limit number of pclk
cycles. While access to both DRAM and CPU/PEP Subsystem registers will take a variable number of
cycles (due to DRAM traffic, PCU command execution or the different timing required to access registers
in imported IP) each access should complete before the timeout_limit occurs. Therefore it should not be
possible to stall the CPU by locking either the CPU Subsystem or DIU buses. However given the fatal
effect such a stall would have it is considered prudent to implement bus timeout detection.

PS6:

// Only thing remaining is to implement a bus timeout function.

if (cpu_start_access == 1) then
    access_initiated = TRUE
    timeout_countdown = BusTimeout

if ((mmu_cpu_rdy == 1) OR (mmu_cpu_berr == 1)) then
    access_initiated = FALSE
    peri_access_en = 0
    dram_access_en = 0

if ((clock_tick == TRUE) AND (access_initiated == TRUE)) then
    if (timeout_countdown > 0) then
        timeout_countdown--
    else // timeout has occurred
        peri_access_en = 0    // abort the access
        dram_access_en = 0
        mmu_cpu_berr = 1
        mmu_cpu_rdy = 0
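
The countdown behaviour of PS6 can be modelled in a few lines of Python. This is a behavioural sketch
rather than RTL: BusTimeoutMonitor and its method names are hypothetical, the reload value is assumed to
be the PS0 constant timeout_limit, and the model clears access_initiated on a timeout to stand in for the
mmu_cpu_berr assertion that terminates the access.

```python
class BusTimeoutMonitor:
    """Behavioural model of the PS6 bus timeout countdown."""

    def __init__(self, timeout_limit: int = 0x40) -> None:
        self.timeout_limit = timeout_limit   # assumed reload value (PS0 constant)
        self.access_initiated = False
        self.countdown = 0

    def start_access(self) -> None:
        """Model cpu_start_access == 1."""
        self.access_initiated = True
        self.countdown = self.timeout_limit

    def complete_access(self) -> None:
        """Model mmu_cpu_rdy or mmu_cpu_berr terminating the access."""
        self.access_initiated = False

    def clock_tick(self) -> bool:
        """Advance one pclk cycle; return True if a bus error must be raised."""
        if not self.access_initiated:
            return False
        if self.countdown > 0:
            self.countdown -= 1
            return False
        self.access_initiated = False        # abort the access
        return True
```

With a limit of 2 the monitor reports the timeout on the third tick after the access starts, because the
counter is only tested for zero after it has been decremented twice.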
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J3 



11.6.6.5 ICache

The ICache sub-block implementation is described in section 11.7.1.1.



11.7 Cache 



The decision on what type of caching solution to use on SoPEC is still open for the moment. There are
two probable solutions: a) use the LEON caches with a minimal configuration (1 KB I and D caches) and
b) use separate, simple one line 256-bit caches for instruction, data read and data write accesses. From a
performance and (most likely) implementation point of view the LEON caches are the best solution;
however, they are much bigger than the one line caches (approx 6x). The one line caches do not offer the same
degree of performance improvement as the LEON caches and are likely to add an extra cycle to all
memory accesses. The performance penalty for a LEON cache miss (i.e. for all memory accesses if we are not
using the LEON caches) and the best and worst case access times from DRAM have yet to be fully
determined. The final decision on which caching solution to use will be made when all such information is
available.

Therefore the section on caches, which was present in previous versions of this document but is now 
mostly out of date, has been removed (the ICache is still relevant if one line caches are used and so is 
retained). 



11.7.1 Instruction Cache



A caching mechanism would offer the advantage of greater aggregate performance while still guaranteeing
a minimum level of performance. While greater performance may not be required at present for this
application, the caching mechanism offers greater efficiency (i.e. MIPS/MHz) and so the CPU clock could be
reduced without affecting, or only negligibly affecting, the operating performance. The advantage here is
that the design is scalable - better performance can be achieved by simply increasing the clock rate.

As all reads from the embedded DRAM on SoPEC produce words that are 256 bits wide it is inefficient to
hook this up to a 32-bit CPU bus as 224 bits of each read would be discarded. If the full 256-bit word is
stored locally to the CPU as a single-line cache then a ??x performance improvement could be obtained in
the typical case (this is of course highly code dependent). This single line cache would be very easy to
implement as it would just involve the address to be compared to a single tag and no replacement
algorithm would be required. Furthermore the area impact would be minor and there should be no performance
penalty for cache misses. As the dram_cpu_data bus is 256 bits wide the requested word is immediately
available to the CPU, i.e. we do not need to perform critical word first reordering of the data.

The instruction cache is only accessed for instruction fetches, not all CPU reads. These can be
differentiated by signals emanating from the CPU. Non-instruction CPU reads would be supported by the data
cache. In the case of a cache miss the read request is processed by the MMU to ensure the request is valid
before a read request is generated on the relevant external (to the CPU block) bus. The MMU should be
informed of a cache hit to ensure it does not generate an unnecessary read request. This requires that the
regions used to store code are aligned on 32-byte (256-bit) boundaries.

As there is no requirement to have more time deterministic code execution the instruction cache cannot be
disabled.



11.7.1.1 ICache Implementation



The Instruction Cache used in SoPEC is capable of storing just a single 256-bit DRAM word. An
implementation is depicted in Figure 22 below. The block I/Os are given in Table 24 and these should be viewed
in conjunction with Figure 19 and Figure 20 for a complete depiction of the connectivity of the block.

Figure 22. ICache Block Diagram



Table 24. ICache I/Os

Port name               Pins  I/O  Description

Global SoPEC signals
prst_n                  1     In   Global reset. Synchronous to pclk, active low.
pclk                    1     In   Global clock.

Toplevel ICache signals
dram_cpu_data[255:0]    256   In   Data bus from the DIU.
cpu_acode[1:0]          2     In   CPU access control signals.
cpu_adr[21:2]           20    In   CPU core address bus.

ICache to DIU Bus Interface signals
ic_cache_hit            1     Out  Cache hit signal. This indicates that the current CPU read request
                                   is being serviced by the ICache and so should not be retrieved from
                                   the DRAM.
dram_rdy                1     In   Data Ready signal. Indicates the data on the dram_cpu_data bus is
                                   valid.

ICache to MMU Control Block signals
ic_data[31:0]           32    Out  ICache data bus.
ic_rdy                  1     Out  Ready signal from the ICache indicating the data on ic_data is valid.
dram_access_en          1     In   DRAM access enable signal. Indicates that the current CPU access
                                   is a valid DRAM access.



Description: 

The Tag stores the DRAM word address of the word currently in cache. The Tag contents are compared
with cpu_adr[21:5] each time the CPU requests an instruction fetch from a valid DRAM address
(indicated by cpu_acode[0] and dram_access_en). If a match occurs (i.e. a cache hit) the access is serviced by
returning the correct 32 bits (as selected by cpu_adr[4:2]) to the MMU Control Block. If a match does not
occur (i.e. a cache miss) the ic_cache_hit line is held low, indicating to the DIU Bus Interface that a
DRAM access should commence. Completion of the DRAM access is signalled by the assertion of
dram_rdy and this causes the ICache contents to be updated, the Tag value replaced and the relevant 32
bits forwarded to the CPU accompanied by the assertion of the ic_rdy signal. The Tag is updated each time the
cache line is refilled from DRAM. All instruction fetches from DRAM are cacheable, regardless of which
DRAM region is being accessed (although the access permissions still need to match those programmed
for the region) and whether the CPU is in user or supervisor mode.
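
The tag compare and refill behaviour described above can be modelled with a short Python sketch. It is a
behavioural illustration only: SingleLineICache and dram_read are hypothetical names, bus timing is
ignored, and word 0 is assumed to occupy the least significant 32 bits of the 256-bit line.

```python
from typing import Callable, Optional, Tuple


class SingleLineICache:
    """Behavioural model of a cache holding one 256-bit DRAM word."""

    def __init__(self, dram_read: Callable[[int], int]) -> None:
        # dram_read(line_address) returns the 256-bit DRAM word as an int.
        self._dram_read = dram_read
        self.tag: Optional[int] = None   # no valid line after reset
        self.line = 0

    def fetch(self, cpu_adr: int) -> Tuple[int, bool]:
        """Return (32-bit instruction word, ic_cache_hit) for a byte address."""
        tag = (cpu_adr >> 5) & 0x1FFFF           # cpu_adr[21:5]
        hit = self.tag == tag
        if not hit:
            # Cache miss: refill the line from DRAM and replace the Tag.
            self.line = self._dram_read(tag)
            self.tag = tag
        word_index = (cpu_adr >> 2) & 0x7        # cpu_adr[4:2]
        return (self.line >> (32 * word_index)) & 0xFFFFFFFF, hit
```

Sequential fetches within one 32-byte line cost a single DRAM read in this model, which is the source of
the performance gain discussed in section 11.7.1.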



11.7.2 Data Cache 



11.8 Realtime Debug Unit (RDU)

The RDU facilitates the observation of the contents of most of the CPU addressable registers in the SoPEC
device in addition to some pseudo-registers in realtime. The contents of pseudo-registers, i.e. registers that
are collections of otherwise unobservable signals and that do not affect the functionality of a circuit, are
defined in each block as required. Many blocks do not have pseudo-registers and some blocks (e.g. ROM,
PSS) do not make debug information available to the RDU as it would be of little value in realtime debug.

Each block that supports realtime debug observation features a DebugSelect register that controls a local
mux to determine which register is output on the block's data bus (i.e. block_cpu_data). One small
drawback with reusing the block's data bus is that the debug data cannot be present on the same bus during a
CPU read from the block. An accompanying active high block_cpu_debug_valid signal is used to indicate
when the data bus contains valid debug data and when the bus is being used by the CPU. There is no
arbitration for the bus as the CPU will always have access when required. A block diagram of the RDU is
shown in Figure 23.

Figure 23. Realtime Debug Unit block diagram
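
The sharing of a block's data bus between CPU reads and debug observation can be sketched in Python as
below. The function and parameter names are hypothetical, and the model assumes that the debug valid
signal is simply deasserted whenever the CPU is reading from the block.

```python
from typing import Optional, Sequence, Tuple


def block_data_bus(registers: Sequence[int], debug_select: int,
                   cpu_read_index: Optional[int] = None) -> Tuple[int, int]:
    """Return (block_cpu_data, block_cpu_debug_valid) for one cycle.

    registers      -- register contents of the block, indexed by register number
    debug_select   -- value of the block's DebugSelect register
    cpu_read_index -- register being read by the CPU, or None if no CPU read
    """
    if cpu_read_index is not None:
        # The CPU always wins: the bus carries read data and is not valid debug data.
        return registers[cpu_read_index], 0
    # Otherwise the local mux drives the register chosen by DebugSelect.
    return registers[debug_select], 1
```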



Table 25. RDU I/Os

Port name               Pins  I/O  Description

diu_cpu_data            32    In   Read data bus from the DIU block.
cpr_cpu_data            32    In   Read data bus from the CPR block.
gpio_cpu_data           32    In   Read data bus from the GPIO block.
icu_cpu_data            32    In   Read data bus from the ICU block.
lss_cpu_data            32    In   Read data bus from the LSS block.
pcu_cpu_debug_data      32    In   Read data bus from the PCU block.
scb_cpu_data            32    In   Read data bus from the SCB block.
tim_cpu_data            32    In   Read data bus from the TIM block.
diu_cpu_debug_valid     1     In   Signal indicating the data on the diu_cpu_data bus is valid debug
                                   data.
tim_cpu_debug_valid     1     In   Signal indicating the data on the tim_cpu_data bus is valid debug
                                   data.
scb_cpu_debug_valid     1     In   Signal indicating the data on the scb_cpu_data bus is valid debug
                                   data.
pcu_cpu_debug_valid     1     In   Signal indicating the data on the pcu_cpu_data bus is valid debug
                                   data.
lss_cpu_debug_valid     1     In   Signal indicating the data on the lss_cpu_data bus is valid debug
                                   data.
icu_cpu_debug_valid     1     In   Signal indicating the data on the icu_cpu_data bus is valid debug
                                   data.
gpio_cpu_debug_valid    1     In   Signal indicating the data on the gpio_cpu_data bus is valid debug
                                   data.
cpr_cpu_debug_valid     1     In   Signal indicating the data on the cpr_cpu_data bus is valid debug
                                   data.
debug_data_out          18    Out  Output debug data to be muxed on to the PHI/GPIO/other pins.
debug_data_valid        1     Out  Debug valid signal indicating the validity of the data on
                                   debug_data_out. This signal is used in all debug configurations.
debug_cntrl             19    Out  Control signal for each PHI bound debug data line indicating
                                   whether or not the debug data should be selected by the pin mux.



As there are no spare pins that can be used to output the debug data to an external capture device, some of
the existing I/Os will have a debug multiplexer placed in front of them to allow them to be used as debug
pins. Unfortunately many of the pins on SoPEC cannot even be multiplexed in this fashion, so it will not be
possible to output a full 32-bit debug data word every cycle. The exact number of pins available for
multiplexing had yet to be finalised at the time of writing. This specification assumes 20 pins will be
available but this can easily be revised up or, more likely, down. Furthermore, not every pin that has a
debug mux will always be available to carry the debug data, as it may be engaged in its primary purpose,
e.g. as a GPIO pin. The RDU therefore outputs a debug_cntrl signal with each debug data bit to indicate
whether the mux associated with each debug pin should select the debug data or the normal data for the pin.
The DebugPinSel register is used to determine which of the (provisionally 20) potential debug pins are
enabled for debug at any particular time.

As it is not possible to output a full 32-bit debug word every cycle, the RDU supports the outputting of an
n-bit sub-word every cycle to the enabled debug pins. Each debug test would then need to be re-run a number
of times with a different portion of the debug word being output on the n-bit sub-word each time. The
data from each run should then be correlated to create a full 32-bit (or whatever size is needed) debug
word for every cycle. The debug_data_valid and pclk_out signals will accompany every sub-word to allow
the data to be sampled correctly. The pclk_out signal is sourced close to its output pad rather than in the
RDU to minimise the skew between the rising edge of the debug data signals (which should be registered
close to their output pads) and the rising edge of pclk_out.

As multiple debug runs will be needed to obtain a complete set of debug data, the n-bit sub-word will need
to contain a different bit pattern for each run. For maximum flexibility each debug pin has an associated
DebugDataSrc register that allows any of the 32 bits of the debug data word to be output on that particular
debug data pin. The debug data pin must be enabled for debug operation by having its corresponding bit in
the DebugPinSel register set for the selected debug data bit to appear on the pin.

The size of the sub-word is determined by the number of enabled debug pins, which is controlled by the
DebugPinSel register. Note that the debug_data_valid signal is always output. Furthermore,
debug_cntrl[0] (which is configured by DebugPinSel[0]) controls the mux for both the debug_data_valid
and pclk_out signals, as both of these must be enabled for any debug operation.
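
The multi-run capture scheme described above can be illustrated with a small software model. This is only a
sketch under stated assumptions: the function names are hypothetical, DebugDataSrc is represented as a plain
list (entry n selects the source bit for debug_data_out[n]), and the pin mask covers data pins only, ignoring
the special role of bit 0 for debug_data_valid and pclk_out.

```python
def capture_subword(debug_word, debug_pin_sel, debug_data_src):
    """Model one cycle of RDU output (hypothetical helper, not part of the spec).

    debug_word     -- the full 32-bit debug data word for this cycle
    debug_pin_sel  -- bitmask; bit n set means debug pin n is enabled
    debug_data_src -- list; entry n is the bit of debug_word routed to pin n
    Returns {pin: bit_value} for the enabled pins only.
    """
    out = {}
    for pin, src_bit in enumerate(debug_data_src):
        if (debug_pin_sel >> pin) & 1:
            out[pin] = (debug_word >> src_bit) & 1
    return out


def reassemble(runs):
    """Correlate the sub-words captured for the same cycle in separate runs.

    runs -- iterable of (debug_data_src, captured) pairs
    Returns (word, seen_mask); seen_mask marks which bits were observed.
    """
    word = 0
    seen = 0
    for data_src, captured in runs:
        for pin, bit in captured.items():
            src_bit = data_src[pin]
            word |= bit << src_bit
            seen |= 1 << src_bit
    return word, seen


# Example: with 4 enabled data pins, 8 runs are needed to cover all 32 bits.
N_PINS = 4
PIN_SEL = (1 << N_PINS) - 1
WORD = 0xDEADBEEF
example_runs = []
for base in range(0, 32, N_PINS):
    src = [base + i for i in range(N_PINS)]
    example_runs.append((src, capture_subword(WORD, PIN_SEL, src)))
rebuilt, seen_mask = reassemble(example_runs)
```

The seen mask makes it easy to check that the chosen DebugDataSrc settings really covered every bit of
interest before the runs are trusted.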

The mapping of debug_data_out[n] signals onto individual pins will take place outside the RDU. When
the exact mapping has been finalised it will be recorded here. A proposed mapping is shown in Table 26
below.



Table 26. Example DebugPinSel mapping

Bit    Pin
0      phi_frclk. The debug_data_valid signal will appear on this pin when enabled. Enabling this
       pin also automatically enables the phi_readl pin, which will output the pclk_out signal
1      phi_profile
2      phi_lsyncl
3      test pin 1
4      test pin 2
5-18   gpio[0...13]



Table 27. RDU Configuration Registers

Address       Register       #bits  Reset     Description
0x80          DebugSrc       4      0x00      Denotes which block is supplying the debug data. The
                                              encoding of this block is given below.
                                              0 - MMU
                                              1 - TIM
                                              2 - LSS
                                              3 - GPIO
                                              4 - SCB
                                              5 - ICU
                                              6 - CPR
                                              7 - DIU
                                              8 - PCU
0x84          DebugPinSel    19     0x0_0000  Determines whether a pin is used for debug data output.
                                              A provisional mapping of pin to bit position is given in
                                              Table 26.
                                              1 - Pin outputs debug data
                                              0 - Normal pin function
0x88 to 0xCC  DebugDataSrcN  5      0x00      Selects which bit of the 32-bit debug data word will be
                                              output on debug_data_out[N]



11.9 Interrupt Operation

The interrupt controller unit (see chapter 14) generates an interrupt request by driving interrupt request
lines with the appropriate interrupt level. LEON supports 15 levels of interrupt with level 15 as the highest
level (the SPARC architecture manual [32] states that level 15 is non-maskable but we have the freedom to
mask this if desired). The CPU will begin processing an interrupt exception when execution of the current
instruction has completed and it will only do so if the interrupt level is higher than the current processor
priority. If a second interrupt request arrives with the same level as an executing interrupt service routine
then the exception will not be processed until the executing routine has completed.

When an interrupt trap occurs the LEON hardware will place the program counters (PC and nPC) into two
local registers. The interrupt handler routine is expected, as a minimum, to place the PSR register in
another local register to ensure that the LEON can correctly return to its pre-interrupt state. The 4-bit
interrupt level (irl) is also written to the trap type (tt) field of the TBR (Trap Base Register) by
hardware. The TBR then contains the vector of the trap handler routine to which the processor will then
jump. The TBA (Trap Base Address) field of the TBR must have a valid value before any interrupt processing
can occur so it should be configured at an early stage.
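
The decision to take an interrupt and the resulting trap vector can be sketched as follows. This is an
illustrative model, not the LEON implementation: the function names are hypothetical, and the TBR layout
(TBA in bits 31:12, tt in bits 11:4, low four bits zero) and the trap type of 0x10 plus the interrupt level
are taken from the SPARC V8 architecture rather than from this document.

```python
def interrupt_taken(irl, pil, et, nmi_level15=True):
    """Return True if a request at level irl would be taken.

    irl -- requested interrupt level, 0 (none) to 15
    pil -- current processor interrupt level (priority)
    et  -- Enable Traps bit of the PSR
    nmi_level15 -- SPARC V8 treats level 15 as non-maskable; set to False
                   to model a design that chooses to mask it (assumption).
    """
    if not et or irl == 0:
        return False
    if nmi_level15 and irl == 15:
        return True
    return irl > pil


def trap_vector(tba, irl):
    """Build the TBR value for an interrupt at level irl.

    tba -- 20-bit Trap Base Address field
    """
    tt = 0x10 + irl                      # SPARC V8 trap type for interrupt_level_n
    return ((tba & 0xFFFFF) << 12) | (tt << 4)
```

For example, with TBA = 0x40000 a level 5 interrupt vectors to 0x40000150. Each trap table entry is 16 bytes
(four instructions), which is why the low four bits of the TBR are zero.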

Interrupt pre-emption is supported while the ET (Enable Traps) bit of the PSR is set. This bit is cleared
during the initial trap processing. In initial simulations the ET bit was observed to be cleared for up to
30 cycles. This causes significant additional interrupt latency in the worst case where a higher priority
interrupt arrives just as a lower priority one is taken.

The interrupt acknowledge cycles shown in Figure 24 below are derived from simulations of the LEON
processor and accompanying interrupt controller. This interrupt controller will be replaced by the ICU in
the SoPEC design. The LEON signal names are used for future reference. An interrupt is asserted by driving
its (encoded) level on the iui.irl[3:0] signals. The LEON core responds to this, with variable timing, by
reflecting the level of the taken interrupt on the iuo.irl[3:0] signals and asserting the acknowledge signal
iuo.intack. The interrupt controller then removes the interrupt level one cycle after it has seen the level
being acknowledged by the core. If there is another pending interrupt (of lower priority) then this should
be driven on iui.irl[3:0] and the CPU will take that interrupt (the level 9 interrupt in the example below)
once it has finished processing the higher priority interrupt. The iuo.irl[3:0] signals always reflect the
level of the last taken interrupt, even when the CPU has finished processing all interrupts.



[Figure 24: timing waveforms of pclk, iui.irl[3:0], iuo.irl[3:0] and iuo.intack, first for a single
interrupt (level 0x5) and then for a case with a pending lower priority interrupt]

Figure 24. Interrupt acknowledge cycles for a single and pending interrupts
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11.10 Boot Operation 

See section 17.2 for a description of the SoPEC boot operation. 

11.11 Software Debug

Software debug mechanisms are discussed in the "SoPEC Software Debug" document [15].
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12 Serial Communications Block (SCB) 

12.1 Overview 

The Serial Communications Block (SCB) handles the movement of all data between the SoPEC and the
host device (i.e. PC) and between master and slave SoPEC devices. The SCB consists of a USB 1.1 device
controller, an Inter-SoPEC Interface (ISI) and a DMA manager. A block diagram of the SCB is shown in
Figure 25 below. The major blocks of the SCB, namely the ISI, USB and DMA manager, could be implemented
as separate blocks but are integrated to take advantage of the performance gains and design
simplifications that a tighter coupling allows.



[Figure 25: block diagram of the SCB showing the USB Controller, the ISI, the SRAM/register file and the
SCB Control Block & DMA Manager, together with the clock, reset, CPU, interrupt, DIU, debug and GPIO
interface signals listed in Table 28]

Figure 25. Serial Communications Block

The USB Controller will be an imported piece of IP. There are many possible sources of this block but it is
likely that it will be supplied by the silicon vendor - all three current silicon vendor candidates will supply
USB 1.1 controllers, although some of these have been sourced from a third party.

The SCB can be seen in the context of the overall SoPEC device in Figure 26 below.
[Figure 26: top-level block diagram of SoPEC, including the eDRAM, the DIU and the CPU subsystem]

Figure 26. SoPEC top-level block diagram
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12.2 Definitions of I/Os 

Table 28. Serial Communications Block I/O

Port name             Pins  I/O  Description

Clocks and Resets
prst_n                1     In   System reset signal. Active low.
pclk                  1     In   System clock.
usb_clk               1     In   Clock for the USB controller block.
isi_cpr_reset_n       1     Out  Signal from the ISI indicating that ISI activity has been detected
                                 while in sleep mode and so the chip should be reset. Active low.
usb_cpr_reset_n       1     Out  Signal from the USB controller that a USB reset has occurred.
                                 Active low.

CPU Interface
cpu_adr[n:2]          n-1   In   CPU address bus. Exact width is currently TBD as it is dependent
                                 on the address maps of imported IP.
cpu_dataout[31:0]     32    In   Shared write data bus from the CPU
scb_cpu_data[31:0]    32    Out  Read data bus to the CPU
cpu_rwn               1     In   Common read/not-write signal from the CPU
cpu_fc[2:0]           3     In   CPU Function Code signals.
cpu_scb_sel           1     In   Block select from the CPU. When cpu_scb_sel is high both cpu_adr
                                 and cpu_dataout are valid
scb_cpu_rdy           1     Out  Ready signal to the CPU. When scb_cpu_rdy is high it indicates the
                                 last cycle of the access. For a write cycle this means cpu_dataout
                                 has been registered by the SCB and for a read cycle this means the
                                 data on scb_cpu_data is valid.
scb_cpu_berr          1     Out  Bus error signal to the CPU indicating an invalid access.
scb_cpu_debug_valid   1     Out  Signal indicating that the data currently on scb_cpu_data is valid
                                 debug data

Interrupt signals
dma_icu_irq           1     Out  DMA interrupt signal to the interrupt controller block.
isi_icu_irq           1     Out  ISI interrupt signal to the interrupt controller block.
usb_icu_irq           1     Out  USB interrupt signal to the interrupt controller block.

DIU interface
scb_diu_wadr[21:5]    17    Out  Write address bus to the DIU
scb_diu_data[63:0]    64    Out  Data bus to the DIU.
scb_diu_wreq          1     Out  Write request to the DIU
diu_scb_wack          1     In   Acknowledge from the DIU that the write request was accepted.
scb_diu_wvalid        1     Out  Signal from the SCB to the DIU indicating that the data currently on
                                 the scb_diu_data[63:0] bus is valid

GPIO Interface
isi_gpio_dout[1:0]    2     Out  ISI output data to GPIO pins
isi_gpio_e[1:0]       2     Out  ISI output enable to GPIO pins
gpio_isi_din[1:0]     2     In   Input data from GPIO pins to ISI
12.3 Multi-SoPEC Systems



While single SoPEC systems are expected to form the majority of SoPEC systems, the SoPEC device must
also support its use in multi-SoPEC systems such as that shown in Figure 27 below. A SoPEC may be
assigned any one of a number of identities in a multi-SoPEC system. A SoPEC may be one or more of a
PrintMaster, a LineSyncMaster, an ISIMaster, a StorageSoPEC or an ISISlave SoPEC.



[Figure 27: system diagram showing the USB connection from the host]

Figure 27. A3 duplex system featuring four printing SoPECs with a single SoPEC DRAM device



12.3.1 ISIMaster device 



The ISIMaster is the only device allowed to drive the common ISI line (see Figure 28) and interfaces
directly with the host. In most systems the ISIMaster will simply be the SoPEC connected to the USB bus.
Future systems, however, may employ an ISI-Bridge chip to interface between the host and the ISI bus and
in such systems the ISI-Bridge chip will be the ISIMaster. There can only be one ISIMaster on an ISI bus.



12.3.2 PrintMaster device 



The PrintMaster device is responsible for co-ordinating all aspects of the print operation. This includes
starting the print operation in all printing SoPECs and communicating status back to the host. When the
ISIMaster is a SoPEC device it is also likely to be the PrintMaster. There may only be one PrintMaster
in a system and it is most likely to be a SoPEC device.
12.3.3 LineSyncMaster device

The LineSyncMaster device generates the lsync pulse that all SoPECs in the system must synchronize
their line outputs with. Any SoPEC in the system could act as a LineSyncMaster although the PrintMaster
is probably the most likely candidate. It is possible that the LineSyncMaster may not be a SoPEC device at
all - it could, for example, come from some OEM motor control circuitry. There may only be one
LineSyncMaster in a system.

12.3.4 Storage device 

For certain printer types it may be realistic to use one SoPEC as a storage device without using its print
engine capability - that is to effectively use it as an ISI-attached DRAM. A storage SoPEC would receive
data from the ISIMaster (most likely to be an ISI-Bridge chip) and then distribute it to the other SoPECs as
required. No other type of data flow (e.g. ISISlave -> storage SoPEC -> ISISlave) would need to be
supported in such a scenario. The SCB supports this functionality at no additional cost because the CPU
handles the task of transferring outbound data from the embedded DRAM to the ISI transmit buffer. The CPU
in a storage SoPEC will have almost nothing else to do.

12.3.5 ISISlave device 

Multi-SoPEC systems will contain one or more ISISlave SoPECs. An ISISlave SoPEC is primarily used to 
generate dot data for the printhead IC it is driving. 
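
The identities described in sections 12.3.1 to 12.3.5, and the uniqueness rules stated for them, can be
sketched as a small consistency check. The class, function and device names below are hypothetical, and the
check only enforces the upper bounds given in the text (at most one ISIMaster per ISI bus, one PrintMaster and
one LineSyncMaster per system); it models a single ISI bus.

```python
from enum import Flag, auto


class Role(Flag):
    """Identities a device may hold; a device may hold several at once."""
    NONE = 0
    PRINT_MASTER = auto()
    LINE_SYNC_MASTER = auto()
    ISI_MASTER = auto()
    STORAGE = auto()
    ISI_SLAVE = auto()


UNIQUE_ROLES = (Role.ISI_MASTER, Role.PRINT_MASTER, Role.LINE_SYNC_MASTER)


def check_system(devices):
    """devices maps a device name to its Role flags.

    Returns a list of human-readable problems; an empty list means the
    role assignment respects the uniqueness rules.
    """
    problems = []
    for role in UNIQUE_ROLES:
        holders = sorted(name for name, roles in devices.items() if role in roles)
        if len(holders) > 1:
            problems.append(f"{role.name} assigned to more than one device: {holders}")
    return problems


# A system in the style of Figure 27: one master, three printing slaves, one storage device.
a3_duplex = {
    "sopec0": Role.ISI_MASTER | Role.PRINT_MASTER | Role.LINE_SYNC_MASTER,
    "sopec1": Role.ISI_SLAVE,
    "sopec2": Role.ISI_SLAVE,
    "sopec3": Role.ISI_SLAVE,
    "storage": Role.STORAGE,
}
```

Using a flag type rather than a single enumerated value reflects the statement that one SoPEC may hold more
than one identity.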

12.3.6 ISI-Bridge device

SoPEC is targeted at the low-cost small office / home office (SoHo) market. It may also be used in future 
systems that target different market segments which are likely to have a high speed interface capability. A 
future device, known as an ISI-Bridge chip, is envisaged which will feature both a high speed interface 
(such as USB2.0, Ethernet or IEEE1394) and one or more ISI interfaces. The use of multiple ISI buses 
would allow the construction of independent print systems within the one printer. The ISI-Bridge would be 
the ISIMaster for each of the ISI buses it interfaces to. 

12.3.7 Host device 

The host device will almost always be, but is not required to be, a PC. Any device that can act as a USB
host or that can interface to an ISI-Bridge chip could be the host device. In particular, with the
development of USB On-The-Go (USB OTG), it is possible that a number of USB OTG enabled products such as
PDAs or digital cameras will be able to directly interface with a SoPEC printer.




12.4 Types of communication

12.4.1 Communications with host 



The host communicates directly with the ISIMaster in order to print pages. When the ISIMaster is a 
SoPEC, the communications channel is USB 1.1. 



12.4.1.1 Host to ISIMaster communication

The host will need to communicate the following information to the ISIMaster device:

• Communications channel configuration and maintenance information

• All data destined for PrintMaster, ISISlave or storage SoPEC devices. This data is simply relayed by
the ISIMaster

• Mapping of virtual communications channels, such as USB endpoints, to ISI destination
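
The last item above, the mapping of virtual channels to ISI destinations, can be pictured as a small routing
table held by the ISIMaster. The sketch below is purely illustrative: the class name, the use of integer ISI
destination ids, the LOCAL marker for data consumed by the ISIMaster itself and the endpoint numbers are all
assumptions, as the actual configuration format is not defined here.

```python
LOCAL = "local"  # hypothetical marker: data is consumed by the ISIMaster itself


class ChannelMap:
    """Hypothetical table mapping a USB endpoint number to an ISI destination."""

    def __init__(self):
        self._dest = {}

    def configure(self, endpoint, destination):
        """Record that data arriving on this endpoint goes to this destination."""
        self._dest[endpoint] = destination

    def route(self, endpoint, payload):
        """Return (destination, payload); the payload is relayed unchanged."""
        if endpoint not in self._dest:
            raise KeyError(f"endpoint {endpoint} has no configured destination")
        return self._dest[endpoint], payload


# Example configuration sent by the host (values are illustrative only).
channels = ChannelMap()
channels.configure(1, LOCAL)   # data for the ISIMaster's own use
channels.configure(2, 1)       # relay to the ISISlave with ISI id 1
channels.configure(3, 2)       # relay to the ISISlave with ISI id 2
```

The route function does not inspect the payload, matching the statement that such data is simply relayed by
the ISIMaster.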
12.4.1.2 ISIMaster to host communication

The ISIMaster will need to communicate the following information to the host:

• Communications channel configuration and maintenance information

• All data originating from the PrintMaster, ISISlave or storage SoPEC devices and destined for the host.
This data is simply relayed by the ISIMaster

12.4.1.3 Host to PrintMaster communication 

The host will need to communicate the following information to the PrintMaster device:

• Program code for the PrintMaster 

• Compressed page data for the PrintMaster 

• Control messages to the PrintMaster 

• Tables and static data required for printing e.g. dead nozzle tables, dither matrices etc.

• Authenticatable messages to upgrade the printer's capabilities 

12.4.1.4 PrintMaster to host communication

The PrintMaster will need to communicate the following information to the host:

• Printer status information (i.e. authentication results, paper empty/jammed etc.)

• Dead nozzle information

• Memory buffer status information

• Power management status

• Encrypted SoPEC_id for use in the generation of PRINTER_QA keys during factory programming

12.4.1.5 Host to ISISlave communication

All communication between the host and ISISlave SoPEC devices must take place via the ISIMaster. In
the case of a SoPEC ISIMaster it is possible to configure each individual USB endpoint to act as a control
channel to an ISISlave SoPEC if desired, although the endpoints will more usually be used to transport
data. The host will need to communicate the following information to ISISlave devices over the comms/ISI:

• Program code for ISISlave SoPEC devices

• Compressed page data for ISISlave SoPEC devices

• Control messages to the ISISlave SoPEC (where a control channel is supported)

• Tables and static data required for printing e.g. dead nozzle tables, dither matrices etc.

• Authenticatable messages to upgrade the printer's capabilities

12.4.1.6 ISISlave to host communication

All communication between the ISISlave SoPEC devices and the host must take place via the ISIMaster.
The ISISlave will need to communicate the following information to the host over the comms/ISI:

• Responses to the host's control messages (where a control channel is supported)

• Dead nozzle information from the ISISlave SoPEC.

• Encrypted SoPEC_id for use in the generation of PRINTER_QA keys during factory programming
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12.4.2 Communication over ISI 

12.4.2.1 ISIMaster to PrintMaster communication

The ISIMaster and PrintMaster will often be the same physical device. When they are different devices
then the following information needs to be exchanged over the ISI:

• All data from the host destined for the PrintMaster (see section 12.4.1.3). This data is simply relayed
by the ISIMaster

12.4.2.2 PrintMaster to ISIMaster communication

The ISIMaster and PrintMaster will often be the same physical device. When they are different devices
then the following information needs to be exchanged over the ISI:

• All data from the PrintMaster destined for the host (see section 12.4.1.4). This data is simply relayed
by the ISIMaster

12.4.2.3 ISIMaster to ISISlave communication

The ISIMaster may wish to communicate the following information to the ISISlaves:

• All data (including program code such as ISIId enumeration) originating from the host and destined for
the ISISlave (see section 12.4.1.5). This data is simply relayed by the ISIMaster

• Wake up from sleep mode

12.4.2.4 ISISlave to ISIMaster communication

The ISISlave may wish to communicate the following information to the ISIMaster:

• All data originating from the ISISlave and destined for the host (see section 12.4.1.6). This data is
simply relayed by the ISIMaster

12.4.2.5 PrintMaster to ISISlave communication

When the PrintMaster is not the ISIMaster all ISI communication is done in response to ISI ping packets
(see 12.6.4.5). When the PrintMaster is the ISIMaster then it will of course communicate directly with
the ISISlaves. The PrintMaster SoPEC may wish to communicate the following information to the ISISlaves:

• Ink status e.g. requests for dotCount data i.e. the number of dots in each color fired by the printheads
connected to the ISISlaves

• Configuration of GPIO ports e.g. for clutch control and lid open detect

• Power down command telling the ISISlave to enter sleep mode

• Ink cartridge fail information

This list is not complete and the time constraints associated with these requirements have yet to be
determined.

In general the PrintMaster may need to be able to:

• send messages to an ISISlave which will cause the ISISlave to return the contents of ISISlave registers
to the PrintMaster, or

• program ISISlave registers with values sent by the PrintMaster

This should be under the control of software running on the CPU which writes messages to the ISI/SCB
interface.
12.4.2.6 ISISlave to PrintMaster communication

ISISlaves may need to communicate the following information to the PrintMaster:

• Ink status e.g. dotCount data i.e. the number of dots in each color fired by the printheads connected to
the ISISlaves

• Band related information e.g. finished band interrupts

• Page related information i.e. buffer underrun, page finished interrupts

• MMU security violation interrupts

• GPIO interrupts and status e.g. clutch control and lid open detect

• Printhead temperature

• Printhead dead nozzle information from SoPEC printhead nozzle tests

• Power management status

This list is not complete and the time constraints associated with these requirements have yet to be
determined.

As the ISI is an insecure interface, commands issued over the ISI should be of limited capability e.g. only
limited register writes allowed. The software protocol needs to be constructed with this in mind. In general
ISISlaves may need to return register or status messages to the PrintMaster or ISIMaster. They may also
need to indicate to the PrintMaster or ISIMaster that a particular interrupt has occurred on the ISISlave.
This should be under the control of software running on the CPU which writes messages to the ISI block.
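
One way software on an ISISlave could restrict what is done on behalf of ISI commands is to accept register
writes only for an explicit allow-list. The sketch below is a hypothetical illustration of that idea: the
message format, the register addresses and the decision to leave reads unrestricted are all assumptions and
are not defined by this specification.

```python
# Hypothetical addresses of registers that may be written on behalf of ISI commands.
WRITABLE_OVER_ISI = frozenset({0x10, 0x14})


def handle_isi_command(command, registers, writable=WRITABLE_OVER_ISI):
    """Process one (op, address, value) command received over the ISI.

    registers is a dict modelling the slave's register file.
    Returns a (status, address, data) reply tuple.
    """
    op, address, value = command
    if op == "read":
        # Return the register contents to the requester.
        return ("data", address, registers.get(address, 0))
    if op == "write":
        if address not in writable:
            # Refuse writes outside the allow-list; the register is untouched.
            return ("rejected", address, None)
        registers[address] = value & 0xFFFFFFFF
        return ("ok", address, None)
    return ("rejected", address, None)
```

Keeping the allow-list in software rather than hardware matches the statement that this behaviour is under
the control of software running on the CPU.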

12.4.2.7 ISISlave to ISISlave communication

It is currently not anticipated that there will be any direct communication between ISISlave SoPECs.
However they can communicate indirectly via the ISIMaster SoPEC. The most likely scenario for such a
communication mechanism is when the PrintMaster is not the ISIMaster (see sections 12.4.2.5 and 12.4.2.6
for a description of the information exchanged between a PrintMaster and an ISISlave). ISISlave to ISISlave
communication would also be required when sending data stored in a storage SoPEC device to an
ISISlave.



12.5 USB 



The USB1.1 interface for the printer should consist of the USB connector, the necessary discretes for USB
signalling and the SoPEC device. A SoPEC printer will act as a self-powered, full-speed device and
SoPEC itself will not draw any power from the USB cable. It will support control and bulk transfers.
Interrupt transfers are not considered necessary because the required interrupt-type functionality can be
achieved by sending query messages over the control channel on a scheduled basis. There is no requirement
to support either isochronous or low-speed transfers. The USB controller must support at least 5
USB endpoints: a control endpoint (endpoint 0) and 4 bulk-data type endpoints. These 4 bulk-data type
endpoints can be used for the transfer of any type of data: compressed page data, program data or control
messages. They may also be mapped on to any target destination in a multi-SoPEC system i.e. configuration
is completely programmable. They are envisaged as always transporting data from the host to SoPEC (i.e. as
OUT endpoints in USB terminology, where direction is named from the host's point of view). Any feedback
data (e.g. status information) will be returned to the host on the control channel (endpoint 0).

The USB device enumeration process will be handled by the SoPEC CPU and USB controller. Note that
this requires the on-chip ROM to contain all the required USB driver code. This is not expected to be the
[...] a "USB-lite" driver that has sufficient functionality to download a program to DRAM.

Details of the configuration registers and interface signals will be provided when the implementation IP
for the USB controller core has been selected. There are several potential candidates for the USB1.1
controller that are being evaluated in terms of cost, maturity, licensing requirements/restrictions,
quality of deliverables etc. - as already mentioned the choice of silicon vendor is likely to play a large
part in selecting the USB controller.

12.5.1 ISIMaster/ISISIave Identification 

The USB controller is used for data transfer when a SoPEC is an ISIMaster; in certain cases it may also
be used to transfer data to an ISISlave. If the USB is not used for data transfer the device will certainly
be an ISISlave. In this case the USB pins could be used to identify the device as an ISISlave as the USB
device controller is expected to allow the single-ended quiescent state of the USB pins to be read by the
CPU either directly or indirectly (as there should be a register indicating whether the USB controller is
operating as a full-speed or low-speed device). We adopt the convention that an ISIMaster SoPEC has its
USB pins configured for full-speed operation (i.e. a pull-up resistor on D+) and an ISISlave SoPEC has its
USB pins configured for low-speed operation (i.e. a pull-up resistor on D-). This allows the ROM bootcode
to quickly determine whether the SoPEC is an ISIMaster or ISISlave without needing to wait for
USB activity. While the ISISlave SoPEC's USB controller believes it is a low-speed device it is never used
and may be disabled completely (if possible) once the device has been identified as an ISISlave. Note that
other combinations on the D+ and D- lines may result in unreliable operation of the USB controller.

The SoPEC's identity as an ISIMaster or ISISlave may also be determined from USB or ISI activity. If
activity is seen on USB endpoints 2-4 then the device is an ISIMaster (note that it is not necessarily an
ISIMaster if activity is only seen on endpoints 0 or 1) and the ISI may automatically configure itself as an
ISIMaster in this situation. If the ISI receives ping packets then it is an ISISlave as only the ISIMaster can
send ping packets.

The most suitable ISIMaster/ISISIave identification scheme (i.e. use of USB pins or looking for USB/ISI 
activity) can be chosen by the software for any given printer. 
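
The two identification schemes described in this section can be summarised in a short sketch. The function
names and return strings are hypothetical, the pin inputs stand for the single-ended quiescent levels of D+ and
D- as read by the CPU, and returning None for the unlisted cases is an assumption made for illustration.

```python
def identify_from_pins(d_plus_high, d_minus_high):
    """Identify the device from the quiescent state of the USB pins.

    A pull-up on D+ (full-speed) marks an ISIMaster; a pull-up on D-
    (low-speed) marks an ISISlave. Other combinations are not a valid
    configuration, so no identity is returned for them.
    """
    if d_plus_high and not d_minus_high:
        return "ISIMaster"
    if d_minus_high and not d_plus_high:
        return "ISISlave"
    return None


def identify_from_activity(active_usb_endpoints, isi_ping_received):
    """Identify the device from observed USB or ISI activity.

    Only the ISIMaster sends pings, so receiving one marks an ISISlave.
    Activity on endpoints 2-4 marks an ISIMaster; activity on endpoints
    0 or 1 alone is not conclusive.
    """
    if isi_ping_received:
        return "ISISlave"
    if any(endpoint in (2, 3, 4) for endpoint in active_usb_endpoints):
        return "ISIMaster"
    return None
```

The pin-based check needs no traffic at all, which is why it suits ROM bootcode, whereas the activity-based
check has to wait until something is received.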

12.5.2 Wake-up from sleep mode

The SoPEC will be placed in sleep mode after a suspend command is received by the USB controller. The
extent of power-down in sleep mode is currently TBD (different silicon vendors offer different options)
but it is expected to involve the loss of DRAM contents at a minimum. The USB controller (or portions of
it) will continue to be powered and clocked in sleep mode. It is likely that a USB reset, as opposed to a
device resume, will be required to bring SoPEC out of its sleep state as the sleep state is hoped to be
logically equivalent to the power down state. The exact reawakening mechanism will be finalised when the
sleep state is more precisely defined and the particular implementation of the USB controller is chosen.

The USB reset signal originating from the USB controller will be propagated to the CPR (as 
usb_cpr_reset_n) if the USBWakeupEnable bit of the WakeupEnable register (see Table 38) has been set. 
The USBWakeupEnable bit should therefore be set just prior to entering sleep mode. 

There are no conditions that require the SoPEC to initiate a USB device wake-up (i.e. where SoPEC
signals resume to the host after being suspended by the host).

12.5.3 USB Speed 

The USB speed will be determined by the amount of activity from other devices that share the USB bus with
the printer and the responsiveness of the host in handling USB interrupts. To guarantee bandwidth to the
printer it is recommended that no other devices are active on the USB bus between the printer and the host.
If the printer is connected to a USB2.0 host or hub it may limit the bandwidth available to other devices 
connected to the same hub but it would not significantly affect the bandwidth available to other devices 
upstream of the hub. Used in the recommended configuration it is expected that an effective bandwidth of 
8-9 Mbit/s will be achieved. 



Doc: SoPEC_hardware_design
Version: 2.3 



S3 Proprietary Document 



29 Nov 2002 
Page 109 



SoPEC : Hardware Design 



12.6 ISI (Inter-SoPEC Interface)

The ISI is utilised in all system configurations requiring more than one SoPEC. An example of such a
system, which requires four SoPECs for duplex A3 printing and an additional SoPEC used as a storage device,
is shown in Figure 27.

The ISI performs much the same function between an ISISlave SoPEC and the ISIMaster as the USB
connection performs between the ISIMaster and the host. This includes the transfer of all program data,
compressed page data and message (i.e. commands or status information) passing between the ISIMaster and
the ISISlave SoPECs. Existing requirements indicate that it is sufficient for the ISIMaster to initiate all
communication with the ISISlaves.

12.6.1 ISIMaster/ISISlave identification and ISISlave enumeration

Section 12.5.1 details how a SoPEC is configured as an ISIMaster or ISISlave. The ISIId is established by
software downloaded over the ISI (in broadcast mode) which looks at the input levels on a number of
GPIO pins to determine the ISIId. For any given printer that uses a multi-SoPEC configuration it is
expected that there will always be enough free GPIO pins on the ISISlaves to support this enumeration
mechanism.

12.6.2 Wake-up from sleep mode

Either the PrintMaster SoPEC or the host may place any of the ISISlave SoPECs in sleep mode prior to
going into sleep mode itself. The ISISlave device should then ensure that the ISIWakeupEnable bit of its
WakeupEnable register (see Table 38) is set prior to entering sleep mode. In an ISISlave device the ISI
block will continue to receive power and clock during sleep mode so that it may monitor the gpio_isi_din
lines for activity. When ISI activity is detected during sleep mode and the ISIWakeupEnable bit is set the
ISI asserts the isi_cpr_reset_n signal. This will bring the rest of the chip out of sleep mode by means of a
wakeup reset. See chapter 16 for more details of reset propagation.

12.6.3 ISI speed 

The ISI will need to run at a speed that will allow error free transmission on the PCB while minimising the
buffering and hardware requirements on SoPEC. While an ISI speed of 10 Mbit/s is adequate to match the
effective USB1.1 bandwidth it would limit the system performance when a high-speed connection (e.g.
USB2.0, IEEE1394) is used to attach the printer to the PC. Although they would require the use of an extra
ISI-Bridge chip such systems are envisaged for more expensive printers (compared to the low-cost basic
SoPEC powered printers that are initially being targeted) in the future.

An ISI line speed (i.e. the speed of each individual ISI wire) of 32 Mbit/s is therefore proposed as it will
allow ISI data to be oversampled 5 times (at a pclk frequency of 160 MHz). The total bandwidth of the ISI
will depend on the number of pins used to implement the interface. The current expectation is that two
pins will be used, giving a peak raw bandwidth of 64 Mbit/s, and this is the scenario that is used in this
document. However the ISI protocol will work equally well if four pins are used for transmission/reception
and this would give a peak raw bandwidth of 128 Mbit/s. The number of pins available for the ISI is
currently under investigation as part of the package selection process. With either a two or four pin ISI
solution a 32 Mbit/s line speed would allow the movement of data in to and out of a storage SoPEC (as
described in 12.3.4 above), which is the most bandwidth hungry ISI use, in a timely fashion.

The maximum effective bandwidth of a two wire ISI, after allowing for protocol overheads and bus
turnaround times, is expected to be approx. 50 Mbit/s.
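
The line speed and peak raw bandwidth figures quoted in this section follow from simple arithmetic. The sketch below only restates the numbers given above (160 MHz pclk, 5 times oversampling, two or four pins); the function names are illustrative and are not part of the design.

```python
PCLK_HZ = 160_000_000   # pclk frequency quoted in the text
OVERSAMPLING = 5        # pclk cycles per ISI bit


def isi_line_speed_bps(pclk_hz=PCLK_HZ, oversampling=OVERSAMPLING):
    """Speed of a single ISI wire in bit/s."""
    return pclk_hz // oversampling


def isi_peak_raw_bandwidth_bps(num_pins, pclk_hz=PCLK_HZ):
    """Peak raw bandwidth when data is interleaved over num_pins wires."""
    return num_pins * isi_line_speed_bps(pclk_hz)


if __name__ == "__main__":
    for pins in (2, 4):
        print(pins, "pins:", isi_peak_raw_bandwidth_bps(pins) // 1_000_000, "Mbit/s")
```

The expected effective bandwidth of approx. 50 Mbit/s is lower than the two-pin peak because of protocol overheads and bus turnaround times, which this sketch does not model.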
12.6.4 ISI protocol

The ISI is a serial interface utilizing a two wire half-duplex configuration as shown in Figure 28 below. An
ISIMaster must always be present and up to 14 ISISlaves may also be on the ISI bus. The ISI bus enables
broadcasting of data, ISIMaster to ISISlave communication, ISISlave to ISIMaster communication and
ISISlave to ISISlave communication. Flow control, error detection and retransmission of errored packets are
also supported. ISI transmission is asynchronous and a Start field is present in every transmitted packet to
ensure synchronization for the duration of the packet. Bit-stuffing is required as it is expected that
synchronization cannot be guaranteed for the length of the longest allowed packet (see footnote 1).
Open Issue: This should be confirmed with the spec of the crystal used with SoPEC. We may wish to
constrain the spec of xtalin and also xtalin for the ISI-Bridge chip to ensure the ISI cannot drift out of
sync during packet reception.
Figure 28. ISI configuration with four SoPEC devices

To maximize the effective ISI bandwidth while minimising pin requirements a two wire half-duplex
interleaved transmission scheme is used. Figure 29 below shows how a 16-bit word is transmitted from an
ISIMaster to an ISISlave. Data is interleaved on a bit-by-bit basis over the two ISI lines and this requires all
ISI packets to be an even number of bits in length. This interleaving could easily be extended to four pins
if required.
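
The bit-by-bit interleaving can be sketched as follows. This is an illustration only: it assumes the word is sent lsb first (as stated for bytes elsewhere in this section) and that even-numbered bits travel on the first line and odd-numbered bits on the second. The line assignment is an assumption, since Figure 29 is not reproduced here, and all function names are hypothetical.

```python
def word_to_bits(word, width=16):
    """Bits of `word` in transmit order (lsb first)."""
    return [(word >> i) & 1 for i in range(width)]


def interleave(bits, num_lines=2):
    """Split a serial bit list over `num_lines` ISI lines, one bit at a time."""
    if len(bits) % num_lines:
        raise ValueError("packet length must be a multiple of the line count")
    return [bits[i::num_lines] for i in range(num_lines)]


def deinterleave(lines):
    """Rebuild the serial bit list from the per-line bit lists."""
    n = len(lines)
    bits = [0] * (n * len(lines[0]))
    for i, line in enumerate(lines):
        bits[i::n] = line
    return bits


if __name__ == "__main__":
    line0, line1 = interleave(word_to_bits(0xA5C3))
    print("line 0:", line0)
    print("line 1:", line1)
```

Passing num_lines=4 gives the four-pin variant mentioned above; the even-length requirement then becomes a multiple-of-four requirement in this sketch.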

All ISI transactions are initiated by the ISIMaster and every non-broadcast data packet needs to be
acknowledged by the addressed recipient. An ISISlave may only transmit when it receives a ping packet
(see section 12.6.4.5) addressed to it. To avoid bus contention all ISI devices must wait one bit-time (5 pclk
cycles) after detecting the end of a packet before transmitting a packet (assuming they are required to
transmit). All non-transmitting ISI devices must tristate their Tx drivers to avoid line contention. A pull-up
resistor is therefore required on both ISI lines to reduce the possibility of false data detection. The ISI
protocol is defined to avoid devices driving out of order (e.g. when an ISISlave is no longer being addressed).
As the ISI will use standard I/O pads there will be no physical collision detection mechanism.



1. Current max packet size ~= 290 bits = 145 bits per line (on a 2 wire ISI) = 725 160 MHz cycles. Thus the pclks in the two
communicating ISI devices should not drift by more than one cycle in 725, i.e. 1379 ppm. Careful analysis of the crystal, PLL and
oscillator specs and the sync detection circuit is needed here to ensure our solution is robust.
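
The drift tolerance in footnote 1 can be reproduced with the short calculation below. It is a sketch that uses only the figures in the footnote (290-bit packet, two wires, 5 pclk cycles per bit); the function name is illustrative.

```python
def max_drift(packet_bits=290, num_lines=2, cycles_per_bit=5):
    """Return (pclk cycles per packet, allowed drift in ppm for one cycle of slip)."""
    cycles = (packet_bits // num_lines) * cycles_per_bit
    return cycles, 1_000_000 / cycles


if __name__ == "__main__":
    cycles, ppm = max_drift()
    print(cycles, "cycles,", int(ppm), "ppm")
```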
Figure 29. Half-duplex interleaved transmission from ISIMaster to ISISlave

There are three types of ISI packet: a long packet (used for data transmission), a ping packet (used by the
ISIMaster to poll ISISlaves for packets) and a short packet (used to acknowledge receipt of a packet).
All ISI packets are delineated by Start and Stop fields and transmission is atomic, i.e. an ISI packet may
not be split or halted once transmission has started.

12.6.4.1 ISI transactions

The different types of ISI transactions are outlined in Figure 30 below. As described later all NAKs are
inferred and ACKs are not addressed to any particular ISI device.



[Figure 30 panels: each transaction is drawn as a timeline for the ISIMaster, ISISlave A and ISISlave B]

Transaction 1: Long packet to an addressed ISISlave.

Transaction 2: Ping packet to an addressed ISISlave. ISISlave has nothing to send.

Transaction 3: Ping packet to an addressed ISISlave. ISISlaveA responds with a long packet to
ISISlaveB and ISISlaveB responds with an ACK or NAK.

Transaction 4: Ping packet to an addressed ISISlave. ISISlaveA responds with a long packet to
the ISIMaster and the ISIMaster responds with an ACK or NAK.

Figure 30. ISI transactions



12.6.4.2 Start field description and bit stuffing

The Start field serves two purposes: to allow the start of a packet to be unambiguously identified and to
allow the receiving device to synchronise to the data stream. The symbol, or data value, used to identify a
Start field must not legitimately occur in the ensuing packet. Bit stuffing is used to guarantee that the Start
symbol will be unique in any valid (i.e. error free) packet. The Start symbol should therefore be
sufficiently long to ensure that the bit stuffing overhead is low but should still be short enough to reduce its own
contribution to the packet overhead. A Start field length of 8 bits is therefore used as it is an effective
compromise between these two constraints. The Start field, like every byte in a packet, is transmitted with its
rightmost (lsb) bit first.

If the correct symbol value is used bit stuffing offers the further advantage of forcing transitions on the ISI
lines, which will allow synchronization to be maintained. Unfortunately a symbol value that is good for
forcing transitions (e.g. 0x00) is not good for guaranteeing initial synchronization and vice versa, i.e. a symbol
such as 0xAA would ensure initial synchronization but cannot prevent synchronization being lost if a long
run of zeroes or ones is subsequently transmitted.

To resolve this conflict the Start symbol will be 0xAA and three different types of bit stuffing are used.
Whenever 0xAA is encountered in the data stream a 0 is inserted before the msb resulting in the 9-bit
value 0x12A (i.e. b10101010 -> b100101010). To ensure transitions occur during a long run of zeroes a 1
is inserted after 7 zeroes thus 0x00 becomes 0x080 (i.e. b00000000 -> b010000000). Likewise to ensure
transitions will occur during a run of ones a 0 is inserted after 7 ones and so 0xFF becomes 0x17F (i.e.
b11111111 -> b101111111). The receiving ISI device must detect these special values and strip out the
inserted ones and zeroes.
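
The three stuffing rules can be illustrated with a small encoder and decoder. This is a sketch under stated assumptions, not the hardware algorithm: it assumes stuffing is evaluated per byte (not on the running bit stream), and that the stuffed bit is triggered by the first seven transmitted bits of the byte so that the receiver can always tell where a bit was inserted. Neither point is stated in this section. The names STUFF_AFTER, stuff_byte, stuff and unstuff are hypothetical.

```python
# First seven transmitted bits of a byte (its low seven bits, since bytes are
# sent lsb first) that trigger stuffing, mapped to the bit that is inserted.
# ASSUMPTION: per-byte, prefix-triggered stuffing; see the lead-in above.
STUFF_AFTER = {
    0x2A: 0,  # first seven bits of 0xAA: insert a 0 before the msb
    0x00: 1,  # seven zeroes: insert a 1
    0x7F: 0,  # seven ones: insert a 0
}


def stuff_byte(value):
    """Return the bits of one byte in transmit order, with any stuffed bit."""
    bits = [(value >> i) & 1 for i in range(8)]
    low7 = value & 0x7F
    if low7 in STUFF_AFTER:
        bits.insert(7, STUFF_AFTER[low7])
    return bits


def bits_to_int(bits):
    """Interpret a transmit-order bit list as an integer (first bit = lsb)."""
    return sum(b << i for i, b in enumerate(bits))


def stuff(data):
    """Bit-stuff a byte string into a single transmit-order bit list."""
    out = []
    for byte in data:
        out.extend(stuff_byte(byte))
    return out


def unstuff(bits):
    """Remove stuffed bits; raise ValueError on a stuffing violation."""
    data = bytearray()
    pos = 0
    while pos < len(bits):
        low7 = bits_to_int(bits[pos:pos + 7])
        pos += 7
        if low7 in STUFF_AFTER:
            if bits[pos] != STUFF_AFTER[low7]:
                raise ValueError("bit stuffing violation")
            pos += 1
        data.append(low7 | (bits[pos] << 7))
        pos += 1
    return bytes(data)


if __name__ == "__main__":
    for value in (0xAA, 0x00, 0xFF):
        print(hex(value), "->", hex(bits_to_int(stuff_byte(value))))
```

Under these assumptions at most one bit is inserted per byte, which matches the worst-case figure of 1 stuffed bit in every 9 transmitted bits. In a real receiver the ValueError case would correspond to the FrameError status bit being set.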

Note that any violation of bit stuffing will result in the FrameError status bit being set and the incoming
packet will be treated as an errored packet. Furthermore if the Start field is not received as 0xAA the
FrameError status bit is set and incoming data is discarded until a correct Start field is detected.
In truly random data such a bit stuffing scheme could cause an overhead of approx. 0.15%. While the
data transmitted over the ISI will not be truly random (0x00 and 0xFF are likely to occur more often than
they would in a random data set) the overhead should remain low and will never exceed 11.1% (i.e. 1 in
every 9 bits).

12.6.4.3 Stop field description

A 2-bit Stop field (= b11) is used to ensure that both lines return to the high state before the next packet is
transmitted. Two bits are required because the Stop field will be interleaved over both ISI lines (4 bits
would be used in a 4 wire ISI). The Stop field is not subject to bit stuffing because bit stuffing could result
in the final transmitted bit being a 0 on one of the ISI lines.

12.6.4.4 ISI long packet description

The format of a long ISI packet is shown in Figure 31 below. Data may only be transferred between ISI
devices using a long packet as both the short and ping packets have no payload field. Except in the case of
a broadcast packet, the receiving ISI device will always reply to a long packet with either an explicit ACK
(no error detected in received packet) or an inferred NAK (an error was detected in the received packet).
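[Figure 31 field widths, in transmission order: 8 bits, 3 bits, 5 bits, 256 bits, 16 bits, 2 bits. From the
surrounding text these correspond to the Start, PktDesc, Address, payload, CRC and Stop fields.]

Figure 31. ISI long packet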

All long packets begin with the Start field as described earlier. The PktDesc field is described in Table 29.

Table 29. PktDesc field description

Bit   Description
0     Packet type indicator:
      1 - Short packet
      0 - Non-short (i.e. long/ping) packet
1     Data payload present indicator:
      1 - This packet contains payload (i.e. long packet)
      0 - This packet has no payload
2     Sequence bit value. Only valid for long packets. See section 12.6.4.8 for a
      description of sequence bit operation.



Any ISI device in the system may transmit a long packet but only the ISIMaster may initiate an ISI
transaction using a long packet. An ISISlave may only send a long packet in reply to a ping message from the
ISIMaster. A long packet from an ISISlave may be addressed to any ISI device in the system although the
ISIMaster (or the PrintMaster if it is a different device) will be the usual recipient.

The Address field is straightforward and complies with the ISI naming convention described in section 

The payload field is exactly what is in the transmit buffer of the transmitting ISI device and gets copied
into the receive buffer of the addressed ISI device(s). When present the payload field is always 256 bits.
To ensure strong error detection a 16-bit CRC is appended. This CRC is calculated over the entire packet
(excluding the Start and Stop fields). The HDLC standard CRC-16 (i.e. G(x) = x^16 + x^12 + x^5 + 1) is to be
used for this calculation, which is to be performed serially.
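
A bit-serial CRC using the generator polynomial x^16 + x^12 + x^5 + 1 can be sketched as below. Only the polynomial and the serial, lsb-first processing come from this section. The initial register value and the final inversion are not specified here; the defaults used (0xFFFF for both, as in the X.25 frame check sequence) are assumptions, and the function name is hypothetical.

```python
POLY_REFLECTED = 0x8408  # x^16 + x^12 + x^5 + 1, bit-reversed for lsb-first data


def crc16_serial(data, init=0xFFFF, xorout=0xFFFF):
    """Bit-serial CRC-16 over `data`, processing each byte lsb first.

    ASSUMPTION: `init` and `xorout` follow the X.25 convention; the source
    text specifies only the polynomial.
    """
    crc = init
    for byte in data:
        for i in range(8):
            feedback = (crc ^ (byte >> i)) & 1
            crc >>= 1
            if feedback:
                crc ^= POLY_REFLECTED
    return crc ^ xorout


if __name__ == "__main__":
    print(hex(crc16_serial(b"123456789")))
```

In the packet the calculation would cover the PktDesc, Address and payload fields, since the Start and Stop fields are excluded.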

12.6.4.5 ISI ping packet

The ISI ping packet is used to allow ISISlaves to transmit on the ISI bus. As can be seen from Figure 32
below the ping packet can be viewed as a special case of the long packet. In other words it is a long
packet without any payload, whose PktDesc field is always b000 and whose ISISubId is always 1. The
ISISubId is unused in ping packets because the ISIMaster is addressing the ISI device rather than one of
its DMAChannels. The ISISlave may address any ISIId.ISISubId in response if it wishes.

The ISISlave will respond to a ping packet with either an explicit ACK (if it has nothing to send), an
inferred NAK (if it detected an error in the ping packet) or a long packet (containing the data it wishes to
send). Note that inferred NAKs do not result in the retransmission of a ping packet. This is because the
ping packet will be retransmitted on a predetermined schedule (see 12.6.4.10 for more details).



Figure 32. ISI ping packet

An ISISlave should never respond to a ping message to the broadcast ISIId as this must have been sent in
error. An ISI ping packet will never be sent in response to any packet and may only originate from an
ISIMaster.






12.6.4.6 ISI short packet description

Figure 33. Short ISI packet

12.6.4.7 Error detection and retransmission

The 16-bit CRC will provide a high degree of error detection and the probability of transmission errors
occurring is very low as the transmission channel (i.e. PCB traces) will have a low inherent bit error rate.
[...] a simple retransmission mechanism frees
the CPU from getting involved in error recovery for most errors because the probability of a transmission
error occurring more than once in succession is very, very low in normal circumstances.
After each non-short ISI packet is transmitted the transmitting device will open a reply window. The size
of the reply window will be 9 bit times (i.e. 14 bits transmitted on two wires plus 2 bit times to allow for
bus turnarounds and timing differences) when a short packet is expected and 147 bit times (i.e. 290 bits
transmitted on two wires plus the same 2 bit times) when a long packet is expected. [...] the incoming
acknowledge packet (which may be either a long or short packet) before the reply window closes. When detected
[...] packet was transmitted [...] the transmitting ISI device will keep the transmitted packet in its
transmit buffer for retransmission. If the transmitting device is the ISIMaster it will retransmit the packet
immediately, while if the transmitting device is an ISISlave it will retransmit the packet in response to the
next ping it receives from the ISIMaster.

The transmitting device will keep retransmitting the packet when it receives a NAK until it either
receives an ACK or the number of retransmission attempts equals the value of the NumRetries register. If
the transmission was unsuccessful then the transmitting device sets the TxError bit in its ISIStatus register.
The receiving device also sets the RxError bit in its ISIStatus register whenever it detects NumRetries
[...] in succession. The NumRetries registers in all ISI devices should therefore be set to the
[...] transmission or reception of ping packets do not
[...] case of an ISI device receiving a packet in error from
an ISISlave the NumRetries count will be reset if it subsequently receives an error free packet from any ISI
[...] transactions as these are the only ones where retransmissions will
[...] a NumRetriesCount window which would
[...] If NumRetries is exceeded within this window
then we have a RxError, otherwise we can reset the count.
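
The reply window sizes quoted in this section can be cross-checked against the packet sizes. The sketch below assumes the long packet is made up of the six field widths listed for Figure 31 (the field names attached to those widths are inferred from the text) and uses the 14-bit short packet length and 2 bit-time margin stated above. The names are illustrative.

```python
# Field widths in bits, in transmission order. The names are an assumption
# based on the text; only the widths appear in the Figure 31 residue.
LONG_PACKET_FIELDS = {
    "Start": 8,
    "PktDesc": 3,
    "Address": 5,
    "Payload": 256,
    "CRC": 16,
    "Stop": 2,
}
SHORT_PACKET_BITS = 14  # stated in the reply window description


def reply_window_bit_times(packet_bits, num_lines=2, margin=2):
    """Bit times needed to receive `packet_bits` over `num_lines`, plus margin."""
    return packet_bits // num_lines + margin


if __name__ == "__main__":
    long_bits = sum(LONG_PACKET_FIELDS.values())
    print("long packet:", long_bits, "bits")
    print("short reply window:", reply_window_bit_times(SHORT_PACKET_BITS))
    print("long reply window:", reply_window_bit_times(long_bits))
```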
Note that either a transmit or receive error will cause the ISI to stop transmitting or receiving respectively.
CPU intervention will be required to resolve the source of the problem and to restart the ISI transmit or
receive operation. Transmit or receive errors should be extremely rare and their occurrence will most
likely indicate a serious problem.

Broadcast packets are never acknowledged, to avoid contention on the common ISI lines. If an
[...]

12.6.4.8 Sequence bit operation

To ensure that communication between transmitting and receiving ISI devices is correctly ordered a
sequence bit is included in every long packet to keep both devices in step with each other. [...]
In addition to the transmitted sequence bit all ISI devices keep two local sequence bits, one for each
ISISubId. Furthermore each ISI device maintains a transmit sequence bit for each ISIId and ISISubId it is in
communication with. For packets sourced from the host (via USB) the transmit sequence bit is contained in
the [...] register, while for packets sourced from the CPU the transmit sequence bit is contained in the
CPUISITxBuffCntrl register. The sequence bits for received packets are stored in DMA0SeqBit [...].
All ISI devices will initialise their sequence bits to 0 after reset. It is the responsibility of software to
ensure that the sequence bits of the transmitting and receiving ISI devices are correctly initialised each
time a new source is selected for any ISIId.ISISubId channel.

Sequence bits are not used in broadcast and ping packets. Each SoPEC may also ignore the sequence
bit on either of its ISISubId channels by setting the appropriate bit in the SequenceMask register. The
sequence bit should be ignored for ISISubId channels that will carry data that can originate from more
than one source and is self-ordering, e.g. control messages.

A receiving ISI device will toggle its sequence bit addressed by the ISISubId only when the receiver is
able to accept data and receives an error-free data packet addressed to it. The transmitting ISI device will
toggle its sequence bit for that ISIId.ISISubId channel only when it receives a valid ACK handshake from
the addressed ISI device. [...] sequence bit in both the transmitting and
[...] manner in every subsequent transmission until an error condition is encountered.



[Figure 34: timelines for the Transmitting ISI Device and the Receiving ISI Device]

Figure 34. Successful transmission of two long packets with sequence bit toggling
When the receiving ISI device detects an error in the transmitted long packet or is unable to accept the
packet (because of full buffers for example) it will not return any packet and it will not toggle its local
sequence bit. An example of this is depicted in Figure 35. The absence of any response prompts the
transmitting device to retransmit the original (seq=0) packet. This time the packet is received without any errors
(or buffer space may have been freed) so the receiving ISI device toggles its local sequence bit and
responds with an ACK. The transmitting device then toggles its local sequence bit to a 1 upon correct
receipt of the ACK.



[Figure 35: timelines for the Transmitting ISI Device and the Receiving ISI Device]

Figure 35. Sequence bit operation with errored long packet

However it is also possible for the ACK packet from the receiving ISI device to be corrupted and this
scenario is shown in Figure 36. In this case the receiving device toggles its local sequence bit to 1 when the
long packet is received without error and replies with an ACK to the transmitting device. The transmitting
device detects an error in the ACK packet and so will not change its local sequence bit. It then retransmits
the seq=0 long packet. When the receiving device finds that there is a mismatch between the transmitted
sequence bit and the expected (local) sequence bit it discards the long packet and replies with an ACK.
When the transmitting ISI device correctly receives the ACK it updates its local sequence bit to a 1 thus
restoring synchronization. Note that when the SequenceMask bit for the addressed ISISubId is set then the
retransmitted packet is not discarded and so a duplicate packet will be received. The data contained in the
packet should be self-ordering and so the software handling these packets (most likely control messages)
is expected to deal with this eventuality.
[Figure 36: timelines for the Transmitting ISI Device and the Receiving ISI Device]

Figure 36. Sequence bit operation with ACK error
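
The three scenarios in Figures 34 to 36 can be walked through with a small behavioural model. This is a sketch for illustration, not the hardware: the class and parameter names are hypothetical, a single ISIId.ISISubId channel is modelled, and the treatment of the local sequence bit when the SequenceMask bit is set (it is simply left untouched) is an assumption.

```python
class Receiver:
    """Receiving ISI device for one ISISubId channel."""

    def __init__(self, ignore_seq=False):
        self.seq = 0                  # local (expected) sequence bit
        self.ignore_seq = ignore_seq  # models the SequenceMask bit
        self.delivered = []

    def on_long_packet(self, seq, payload, corrupted=False, buffer_full=False):
        """Return True if an ACK is sent, False for an inferred NAK."""
        if corrupted or buffer_full:
            return False              # no reply, sequence bit unchanged
        if self.ignore_seq:
            self.delivered.append(payload)
        elif seq == self.seq:
            self.delivered.append(payload)
            self.seq ^= 1
        # On a sequence mismatch the duplicate is discarded but still ACKed.
        return True


class Transmitter:
    """Transmitting ISI device for one ISIId.ISISubId channel."""

    def __init__(self, num_retries=2):
        self.seq = 0
        self.num_retries = num_retries
        self.tx_error = False

    def send(self, rx, payload, faults=()):
        """Send one long packet. faults[i] is None, 'packet', 'full' or 'ack'."""
        faults = list(faults)
        for attempt in range(self.num_retries + 1):
            fault = faults[attempt] if attempt < len(faults) else None
            acked = rx.on_long_packet(
                self.seq, payload,
                corrupted=(fault == "packet"),
                buffer_full=(fault == "full"),
            )
            if acked and fault != "ack":  # ACK arrived intact
                self.seq ^= 1
                return True
        self.tx_error = True              # retries exhausted
        return False


if __name__ == "__main__":
    tx, rx = Transmitter(), Receiver()
    tx.send(rx, "A", faults=["ack"])      # the Figure 36 scenario
    print(rx.delivered, tx.seq, rx.seq)
```

With ignore_seq=True the same corrupted-ACK scenario delivers the payload twice, which is the duplicate packet case that the handling software is expected to tolerate.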
12.6.4.9 Flow control



The ISI also supports flow control by treating it in exactly the same manner as an error in the received
packet. Because the SCB enjoys greater guaranteed bandwidth to DRAM than both the ISI and USB can
supply, flow control should not be required during normal operation. Any blockage on a DMA channel will
soon result in the NumRetries value being exceeded and transmission to that DMA channel being halted.
Because flow control is treated in the same manner as an error in the received packet neither the
transmitting nor the receiving ISI device will be able to differentiate the cause of a TxError or RxError.

12.6.4.10 Auto-ping operation

While the CPU of the ISIMaster may send a ping packet by writing the appropriate header to the
CPUISITxBuffCntrl register it is expected that all ping packets will be generated in the ISI itself. The use
of automatically generated ping packets ensures that ISISlaves will be given access to the ISI bus with a
programmable minimum guaranteed frequency in addition to whenever it is idle. Five registers facilitate
the automatic generation of ping messages within the ISI: PingSchedule0, PingSchedule1, PingSchedule2,
ISITotalPeriod and ISILocalPeriod. Auto-pinging can be enabled or disabled by writing to the
AutoPingEnable bit of the ISICntrl register.

Each bit of the 14-bit PingScheduleN register corresponds to an ISIId that is used in the Address field of
the ping packet and a 1 in the bit position indicates that a ping packet is to be generated for that ISIId. A 0
in any bit position will ensure that no ping packet is generated for that ISIId. As ISISlaves may differ in
their bandwidth requirement (particularly if a storage SoPEC is present) three different PingSchedule
registers are used to allow an ISISlave to receive up to three times the number of pings as another active
ISISlave. When the ISIMaster is not sending long packets (sourced from either the CPU or USB in the
case of a SoPEC ISIMaster) ISI ping packets will be transmitted according to the pattern given by the three
PingScheduleN registers. The ISI will start with the lsb of the PingSchedule0 register and work its way from
lsb through msb of each of the PingScheduleN registers. When the msb of PingSchedule2 is reached the
ISI returns to the lsb of PingSchedule0 and continues to cycle through each bit position of each
PingScheduleN register.

With the addition of auto-ping operation there are three potential sources of packets in an ISIMaster:
USB, CPU and auto-ping. Arbitration between the CPU and USB for access to the ISI is handled
outside the ISI (see section 12.7.7) but arbitration between auto-ping packets and CPU/USB originating
packets, which we will refer to as local packets, happens within the ISI. To ensure that local packets get
priority whenever possible and that ping packets can have some guaranteed access to the ISI we use two
4-bit counters whose reload values are contained in the ISITotalPeriod and ISILocalPeriod registers.
An ISI transaction is initiated by the ISIMaster transmitting either a long packet or a ping
packet. The ISITotalPeriod counter is decremented for every ISI transaction when contention occurs (i.e.
both a ping and a local packet wish to transmit) while the ISILocalPeriod counter is decremented for every
local packet that is transmitted. Neither counter is decremented by a retransmitted packet.
The amount of guaranteed ISI bandwidth allocated to both local and ping packets is determined by the
values of the ISITotalPeriod and ISILocalPeriod registers. Local packets will always be given priority when
the ISILocalPeriod counter is non-zero. Ping packets will be given priority when the ISILocalPeriod
counter is zero and the ISITotalPeriod counter is still non-zero. Both the ISITotalPeriod and
ISILocalPeriod counters are reloaded by the next local packet transmit request after the ISITotalPeriod counter has
reached zero. This reload policy minimises the maximum latency for ping packets at the expense of
maximum latency for local packets.

Note that ping packets are quite likely to get more than their guaranteed bandwidth as they will be
transmitted whenever the ISI bus is idle (i.e. no pending local packets) and so do not decrement either counter.
Local packets on the other hand will never get more than their guaranteed bandwidth because each local
packet transmitted decrements both counters. The difference between the values of the ISITotalPeriod and
ISILocalPeriod registers determines the number of automatically generated ping packets that are
guaranteed to be transmitted every ISITotalPeriod number of ISI transactions. If the ISITotalPeriod and
ISILocalPeriod values are the same then the local packets will always get priority and could totally exclude ping
packets if the CPU always has packets to send.

For example if ISITotalPeriod = 0xC; ISILocalPeriod = 0x8; PingSchedule0 = 0x07; PingSchedule1 =
0x06 and PingSchedule2 = 0x04 then four ping messages are guaranteed to be sent in every 12 ISI
transactions. Furthermore ISIId3 will receive 3 times the number of ping packets as ISIId1 and ISIId2 will receive
twice as many as ISIId1. Thus over a period of 36 contended ISI transactions (allowing for two full
rotations through the three PingScheduleN registers) when local packets are always pending 24 local packets
will be sent, ISIId1 will receive 2 ping packets, ISIId2 will receive 4 pings and ISIId3 will receive 6 ping
packets. If local traffic is less frequent then the ping frequency will automatically adjust upwards to
consume all idle ISI bandwidth.
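
The worked example can be checked with a small simulation. The sketch below is a simplified model, not the hardware arbiter: it assumes local packets are always pending (every transaction is contended), ignores retransmissions and idle-bus pings, and reloads both counters when the total counter reaches zero. The function names are hypothetical.

```python
from collections import Counter


def ping_targets(schedules, width=14):
    """Yield ISIIds in auto-ping order: lsb to msb of each schedule, cycling."""
    if not any(schedules):
        return
    while True:
        for sched in schedules:
            for bit in range(width):
                if sched & (1 << bit):
                    yield bit + 1  # bit0 refers to ISIId1


def simulate(total_period, local_period, schedules, transactions):
    """Return the sequence of transactions under permanent contention.

    Each entry is the string 'local' or the ISIId that was pinged.
    """
    pings = ping_targets(schedules)
    total = local = 0
    log = []
    for _ in range(transactions):
        if total == 0:                     # reload both counters
            total, local = total_period, local_period
        if local > 0:                      # local packets have priority
            log.append("local")
            local -= 1
        else:                              # guaranteed ping slot
            log.append(next(pings))
        total -= 1
    return log


if __name__ == "__main__":
    counts = Counter(simulate(0xC, 0x8, [0x07, 0x06, 0x04], 36))
    print(dict(counts))
```

Setting local_period equal to total_period in this model leaves no ping slots at all, which is the starvation case described above.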



12.6.4.11 ISI Registers

Table 30 below details the ISI configuration registers. Note that some of these registers are also used
by other blocks in the SCB.



Table 30. ISI configuration registers

Address      Register name      #bits        Reset        Description
0x00         ISICntrl           (illegible)  (illegible)  ISI Control register
0x04         ISIId              4            0x1          ISIId for this SoPEC. A value of 0 indicates the
                                                          device is an ISIMaster. Note that the SoPEC resets
                                                          to being an ISISlave and that 0xF (the broadcast
                                                          ISIId) is an illegal value and should not be written
                                                          to this register.
0x08         NumRetries         4            0x02         Number of retransmissions to attempt in response to
                                                          a NAK before aborting a long packet transmission
0x0C         ISIPingSchedule0   14           0x0000       Denotes which ISIIds will receive ping packets.
                                                          Note that bit0 refers to ISIId1, bit1 to ISIId2 ...
                                                          bit13 to ISIId14.
0x10         ISIPingSchedule1   14           0x0000       As per PingSchedule0
0x14         ISIPingSchedule2   14           0x0000       As per PingSchedule0
0x18         ISITotalPeriod     4            0xF          Reload value of the ISITotalPeriod counter
0x1C         ISILocalPeriod     4            0xF          Reload value of the ISILocalPeriod counter
0x20         ISIStatus          6            0x00         ISI Status register. This register is read-only.
0x24         ISIMask            6            0x00         ISI Interrupt Mask register
0x30 - 0x4C  CPUISITxBuff       32           n/a          32-byte CPUISI transmit buffer
0x50         CPUISITxBuffCntrl  13           0x0000       Control register for the CPUISI transmit buffer
0x60 - 0x7C  CPUISIRxBuff       32           n/a          32-byte ISI receive buffer. This is the half of the
                                                          double buffer that contains the oldest data.
0x80         ISIRxBuffDest      1            0x0          Only one of the CPU and the DMA manager is
                                                          allowed to empty the receive buffer at any time.
                                                          1 = CPU will empty the receive buffer
                                                          0 = DMA manager will empty the receive buffer


12.6.4.11.1 ISI control register 

The ISICntrl register is described in Table 31 below. Note that the reset value of this register allows the
SoPEC to automatically become an ISIMaster (AutoMasterEnable = 1) if any USB packets are received on
endpoints 2-4. On becoming an ISIMaster the ISIId register is set to 0, the TxEnable bit of the ISICntrl
register is set and any USB or CPU packets destined for other ISI devices are [...]
[...] AutoMasterEnable bit. Automatic ping operation [...]

Table 31. ISICntrl register

Field name        Bit  Description
TxEnable          0    Enables ISI transmission of long or ping packets. This is cleared by
                       transmit errors and so needs to be restarted by the CPU. Note that
                       ACKs may still be transmitted when this bit is 0.
                       1 = Transmission enabled
                       0 = Transmission disabled
RxEnable          1    Enables ISI reception. This is cleared by receive errors and so
                       needs to be restarted by the CPU.
                       1 = Reception enabled
                       0 = Reception disabled
AutoPingEnable    2    Enables auto-ping operation
                       1 = auto-ping enabled
                       0 = auto-ping disabled
AutoMasterEnable  3    Enables the device to automatically become the ISIMaster if
                       activity is detected on USB endpoints 2-4.
                       1 = auto-master operation enabled
                       0 = auto-master operation disabled



12.6.4.11.2 ISI status register 



[...] occurring and
are cleared by writing to either the TxEnable or RxEnable bits of the ISICntrl register or the CPUISITx[...]


Table 32. ISIStatus register

Field name          Bit  Description
FrameError          0    Framing error detected in the received packet. This can be caused
                         by an incorrect Start or Stop field or by bit stuffing errors.
RxError             1    A CRC error or flow control condition was detected in
                         NumRetries successive packets (excluding ping packets).
RxBuffFull          2    There is no space remaining in the receive double buffer.
RxBuffOverflow      3    An overflow has occurred in the ISI receive buffer and a packet had
                         to be dropped.
CPUISITxBuffEmpty   4    The CPUISITxBuff is empty.
TxError             5    Transmission error. Receiving ISI device would not accept the
                         transmitted packet. Only set after NumRetries unsuccessful
                         retransmissions (excluding ping packets).
12.6.4.11.3 ISI mask register

An interrupt will be generated in an edge sensitive manner, i.e. the ISI will generate an isi icu in pulse
each time a status bit goes high and the corresponding bit of the ISIMask register is enabled.
Table 33. ISIMask register

Field name               Bit  Description
FrameErrorIntEn          0    Interrupt enable mask bit for the FrameError status bit
RxErrorIntEn             1    Interrupt enable mask bit for the RxError status bit
RxBuffFullIntEn          2    Interrupt enable mask bit for the RxBuffFull status bit
RxBuffOverflowIntEn      3    Interrupt enable mask bit for the RxBuffOverflow status bit
CPUISITxBuffEmptyIntEn   4    Interrupt enable mask bit for the CPUISITxBuffEmpty status bit
TxErrorIntEn             5    Interrupt enable mask bit for the TxError status bit



12.6.4.11.4 CPUISITxBuffCntrl register 

The CPUISITxBuffCntrl register contains the header field for the packet in the CPUISI transmit buffer.
Writing to this register validates the contents of the CPUISI transmit buffer, i.e. each time the CPU places a
packet in the CPUISI transmit buffer it must write the packet header to this register to initiate its transfer in
to the SCB transmit buffer (see section 12.7). Note that the CPU is responsible for toggling the sequence
bit of any long packets it wishes to transmit. The CPUISITxBuffEmpty status bit will be set when
CPUTxPktSize bytes have been transferred to the SCB transmit buffer.



Table 34. CPUISITxBuffCntrl register

Field name     Bits  Description
PktDesc        2:0   PktDesc field (as per Table 29) for the packet currently in the
                     CPUISI transmit buffer.
DestISISubId   3     Indicates which DMAChannel of the target SoPEC the data in the
                     CPUISI transmit buffer is destined for:
                     0 = DMAChannel0
                     1 = DMAChannel1
DestISIId      7:4   Denotes the ISIId of the target SoPEC as per Table 35
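
As an illustration of the field layout in Table 34, the header value written to CPUISITxBuffCntrl could be assembled as sketched below. The bit positions of the three register fields come from Table 34. The assignment of the packet type indicator to bit 0 of PktDesc is an assumption (the row label is not legible in Table 29), and the function names are hypothetical.

```python
def pkt_desc(short=False, payload=False, seq=0):
    """Build the 3-bit PktDesc field.

    ASSUMPTION: bit 0 = short packet indicator, bit 1 = payload present,
    bit 2 = sequence bit.
    """
    return int(short) | (int(payload) << 1) | ((seq & 1) << 2)


def tx_buff_cntrl(desc, dest_isi_sub_id, dest_isi_id):
    """Pack PktDesc (2:0), DestISISubId (3) and DestISIId (7:4)."""
    if not 0 <= desc <= 0x7:
        raise ValueError("PktDesc is a 3-bit field")
    if dest_isi_sub_id not in (0, 1):
        raise ValueError("ISISubId must be 0 or 1")
    if not 0 <= dest_isi_id <= 0xF:
        raise ValueError("ISIId is a 4-bit field")
    return desc | (dest_isi_sub_id << 3) | (dest_isi_id << 4)


if __name__ == "__main__":
    # Long packet with sequence bit 1, destined for DMAChannel1 of ISISlave2.
    header = tx_buff_cntrl(pkt_desc(payload=True, seq=1), 1, 2)
    print(hex(header))
```

With these assumptions a ping header has a PktDesc of b000, which agrees with the ping packet description in section 12.6.4.5.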



12.7 SCB Mapping 



In order to support maximum flexibility when moving data through a multi-SoPEC system it is possible to
map any USB endpoint onto either DMAChannel within any SoPEC in the system. A logical view of the
SCB is shown in Figure 37. This view differs from the likely implementation but it allows for a clearer
depiction of data movement within the SCB.



[Figure 37 block labels: USB Host, USB Controller, SCB Control Block, CPUISI TxBuffer, SCB TxBuffer,
SCB Map, CPU, CPU Subsystem Bus, DMA Manager (DMAChannels), ISI (Rx), DIU; ISI signals isi_din and
isi_dout.]

Figure 37. SCB logical view

The SCB map, and indeed the SCB itself, is based around the concept of an ISIId and an ISISubId. Each
SoPEC in the system has a unique ISIId and two ISISubIds, namely ISISubId0 and ISISubId1. We use the
convention that ISISubId0 corresponds to DMAChannel0 in each SoPEC and ISISubId1 corresponds to
DMAChannel1. The naming convention for the ISIId is shown in Table 35 below and this would correspond
to a multi-SoPEC system such as that shown in Figure 27. We use the term ISIId instead of SoPECId
to avoid confusion with the unique ChipID used to create the SoPEC_id and SoPEC_id_key (see
chapter 17 and [9] for more details).



Table 35. ISIId naming convention

ISIId   Description
0       ISIMaster (typically a SoPEC connected to the host via USB1.1)
1-14    ISISlave1-14
15      Broadcast ISIId



The ISIId and ISISubId therefore allow us to address any DMAChannel in the system. The ISI,
DMA manager and SCB map hardware use the ISIId and ISISubId to handle the different data streams that
are active in a multi-SoPEC system, as does the software running on the CPU of each SoPEC. In this
document we will identify DMAChannels as ISIx.y where x is the ISIId and y is the ISISubId. Thus ISI2.1
refers to DMAChannel1 of ISISlave2. Any data sent to a broadcast channel, i.e. ISI15.0 or ISI15.1, is
received by every ISI device in the system including the ISIMaster (which may be an ISI-Bridge).
The USB controller and software stacks however have no understanding of the ISIId and ISISubId, but the
Silverbrook printer driver software running on the host PC does make use of the ISIId and ISISubId. USB
is simply used as a data transport - the mapping of USB endpoints onto ISIId and SubId is communicated
from the host PC Silverbrook code to the SoPEC Silverbrook code through USB control (or possibly bulk
data) messages, i.e. the mapping information is simply data payload as far as USB is concerned. The code
running on SoPEC is responsible for parsing these messages and configuring the SCB accordingly.

[...] restrictions on what can be achieved without software [...] of data than there are sinks.
[...] of the control and data messages from the ISIMaster SoPEC in addition [...] specifically
addressed to that particular ISISlave or over the broadcast ISI channel. However all ISISlaves only
have two possible data sinks, i.e. the two DMAChannels. [...] in a multi-SoPEC system which may
receive control messages [...] of information from the host (e.g. over USB). In this case all
[...] DMAChannel0. We resolve these potential conflicts
by adopting the following conventions:

1) Control messages may be interleaved in a memory buffer: The memory buffer that the
DMAChannel0 points to should be regarded as a central pool of control messages. Every control
message must contain fields that identify the size of the message, the source and the destination of
the control message. Control messages may therefore be multiplexed over a DMAChannel, which
allows several control message sources to address the same DMAChannel. Furthermore, if SoPEC-
type control messages contain source and destination fields it is possible for the host to send control
messages to individual SoPECs over the ISI15.0 broadcast channel.

2) Data messages should not be interleaved in a memory buffer: As data messages are typically
part of a much larger block of data that is being transferred it is not possible to control their contents
in the same manner as is possible with the control messages. Furthermore we do not want the CPU
to have to perform reassembly of data blocks. Data messages from different sources cannot be
interleaved over the same DMAChannel - the SCB map must be reconfigured each time a different data
source is given access to the DMAChannel.

3) Every reconfiguration of the SCB map requires the exchange of control messages: The only
active SCB map in a multi-SoPEC system is the SCB map in the ISIMaster, as all ISISlaves
automatically send data addressed to themselves to either DMAChannel0 or 1, i.e. the ISI is the only
source of incoming data in an ISISlave. The ISIMaster's SCB map reset state is shown in Figure 39
and any subsequent modifications require the exchange of control messages between the ISIMaster
and the host. As the host is expected to control the movement of data in any SoPEC system it is
expected that all SCB map reconfigurations will be performed in response to a request from the
host. While the ISIMaster could autonomously reconfigure the SCB map (this is entirely up to the
software running on the ISIMaster) it should not do so without informing the host in order to avoid
data being misrouted.

An example of the above conventions in operation is worked through in section 12.7.2. 
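
Convention 1 can be illustrated with a minimal sketch. The header layout used here (one byte each for
size, source and destination) and the HOST_ID value are assumptions made purely for illustration; the
text above only requires that every control message identifies its size, source and destination.

```python
import struct

HOST_ID = 0xFF  # hypothetical identifier for the host PC as a message source

def pack_ctrl_msg(src, dst, payload):
    """Prefix a payload with a hypothetical 3-byte header: size, source, destination."""
    if len(payload) > 255:
        raise ValueError("payload too long for the 1-byte size field")
    return struct.pack("BBB", len(payload), src, dst) + payload

def unpack_pool(buf):
    """Walk a pool of back-to-back control messages, yielding (src, dst, payload)."""
    offset = 0
    while offset < len(buf):
        size, src, dst = struct.unpack_from("BBB", buf, offset)
        offset += 3
        yield src, dst, buf[offset:offset + size]
        offset += size

def messages_for(buf, my_isi_id):
    """Pick out the messages in the pool whose destination field matches this SoPEC."""
    return [(src, payload) for src, dst, payload in unpack_pool(buf) if dst == my_isi_id]
```

Because each message carries its own size and addressing, messages from several sources can sit in
the same DMAChannel0 buffer, and a message broadcast on ISI15.0 can still be acted on by a single
SoPEC.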
12.7.1 Host PC to ISIMaster SoPEC communication 

When considering SCB map configurations we always assume that the ISIMaster is a SoPEC device
[...] (receiving data on USB endpoint 2, 3 or 4), rather than [...] something similar to an SCB map
and the following information should broadly apply to an ISI-Bridge but we focus here on an
ISIMaster SoPEC for clarity.

As the ISIMaster SoPEC represents the printer on the USB bus it is required by the USB specification to
have a dedicated control endpoint, EP0. At boot time the ISIMaster SoPEC will also require a bulk data
endpoint to facilitate the transfer of program code from the host PC. The simplest SCB map configuration,
i.e. for a single stand-alone SoPEC, is sufficient for host to ISIMaster SoPEC communication and is shown
in Figure 38. In this configuration all USB control information exchanged between the host and SoPEC
is exchanged over EP0 (a bidirectional USB endpoint). SoPEC specific control information (printer
status, DNC info etc.) is also exchanged over EP0.

All packets sent to the host from SoPEC over EP0 must be written into the EP0 FIFO by the CPU. All
packets sent from the host to SoPEC can be placed in DRAM by the DMA Manager (as is usually the
case) or read directly by the CPU. This asymmetry is because in a multi-SoPEC environment the CPU will
need to examine all incoming control messages (i.e. messages that have arrived over DMAChannel0) to
[...] the additional overhead in having the CPU move the short control messages to the EP0 FIFO is
relatively small. Furthermore we wish to avoid making the SCB more complicated than necessary. [...]
there is no significant performance gain to be had as the control traffic will be relatively low
bandwidth. The above mechanisms are appropriate for the types of communication outlined in sections
12.4.1.1 [...].

Figure 38. Single SoPEC SCB map configuration and dataflow

12.7.2 Broadcast communication

[...] broadcast communication is shown in Figure 39. This particular configuration is
also the default, post power-on reset, configuration for the ISIMaster SoPEC. USB endpoints EP2 and EP3
are mapped onto ISISubId0 and ISISubId1 of ISIId15 [...]. EP0 is used for control messages as before
and EP1 is a bulk data endpoint for the ISIMaster SoPEC. Depending on what is [...] compressed page
or other program downloads. [...] the USB device configuration will take place, as it
always must, by exchanging messages over the control channel (EP0).

One possible boot mechanism is where the host PC sends the bootloader1 program code to all SoPECs by
[...] authenticates and executes the bootloader1 program. The ISIMaster SoPEC then polls each ISISlave
(over the ISIx.0 channel). Each ISISlave ascertains [...] reporting its presence and [...] back to the
ISIMaster. The ISIMaster then passes this information back to the host over EP0. Thus both the host
and the ISIMaster have knowledge of the number of SoPECs, and their ISIIds, in the system.

[...] SoPEC system. This could involve simplifying the default configuration to a single SoPEC system
(Figure 38) or remapping the broadcast channels onto DMAChannels in individual ISISlaves.

Figure 39. Default SoPEC SCB map configuration and dataflow

The following steps are required to reconfigure the SCB map from the system depicted in Figure 39 to one
where EP3 is mapped onto ISI1.0:

1) The host PC sends a control message(s) to the ISIMaster SoPEC requesting that USB EP3 be
remapped to ISI1.0.

2) The ISIMaster SoPEC sends a control message to the host PC informing it that EP3 has now been
mapped to ISI1.0 (and therefore the host knows that the previous mapping of ISI15.1 is no longer
available through EP3).

3) The host may now send control messages directly to ISISlave1 without requiring any CPU
intervention on the ISIMaster SoPEC.
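
The request/acknowledge exchange in the steps above can be sketched as follows. This is only an
illustrative model: the IsiMaster class, the dictionary-based message format and its field names are
all hypothetical, and the reset-state map is assumed from the description of Figure 39 (EP0 and EP1
local to the ISIMaster, EP2 and EP3 on the broadcast channels).

```python
class IsiMaster:
    """Toy model of the ISIMaster software that owns the only active SCB map."""

    def __init__(self):
        # endpoint number -> (ISIId, ISISubId); assumed post-reset state
        self.scb_map = {0: (0, 0), 1: (0, 1), 2: (15, 0), 3: (15, 1)}

    def handle_control_message(self, msg):
        """Apply a host remap request and return the acknowledgement for the host."""
        if msg["type"] != "remap_request":
            raise ValueError("unsupported control message")
        endpoint = msg["endpoint"]
        previous = self.scb_map[endpoint]
        self.scb_map[endpoint] = (msg["isi_id"], msg["isi_sub_id"])
        # Step 2: tell the host what the endpoint maps to now and what it replaced.
        return {"type": "remap_ack", "endpoint": endpoint,
                "now": self.scb_map[endpoint], "was": previous}
```

The acknowledgement carries the old mapping so that the host can note that ISI15.1 is no longer
reachable through EP3.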

12.7.3 Host PC - ISISlave SoPEC communication

The [...] (as opposed to post-reset) SCB map configuration for an ISISlave SoPEC is to have
all USB endpoints unconnected. The ISI automatically forwards any data addressed to it (including
broadcast data) to the DMA with the appropriate ISISubId. If the ISIMaster is configured correctly (e.g. when
the ISIMaster is a SoPEC, and that SoPEC's SCB map is configured correctly) then data sent from the host
destined for an ISISlave will be transmitted on the ISI with the correct address. If the ISISlave has data to
send to the host it must do so by sending a control message to the ISIMaster identifying the host as the
intended recipient. It is then the ISIMaster's responsibility to forward this message to the host.
With this configuration the host can communicate with the ISISlave via broadcast messages only and this
is the mechanism by which the bootloader1 program is downloaded. The ISISlave is unable to communicate
with the host (or the ISIMaster) until the bootloader1 program has successfully executed and the
ISISlave has determined what its ISIId is. After the bootloader1 program (and possibly other programs)



Doc: SoPEC_hardware_deslgn 
Version: 2.3 



S3 Proprietary Document 



29 Nov 2002 
Page 126 



SoPEC : Hardware Design 



S5 



All communication from an ISISlave to host is achieved by sending messages via the ISIMaster. The
ISISlave can never initiate communication to the host. If an ISISlave wishes to send a message to the host
it may do one of two things: (a) wait until it is polled by the ISIMaster or (b) indicate in an ISI
acknowledgement packet (sent in response to the reception of an ISI packet specifically addressed to that
ISISlave) [...] destination and will then copy it into the EP0 FIFO for transmission to the
host. The software running on the ISIMaster is responsible for any arbitration between messages from
different sources (including itself) that are all destined for the host.

The above mechanisms are appropriate for the types of communication outlined in sections 12.4.1.5 and
[...].

12.7.4 ISIMaster - ISISlave communication 

All ISIMaster - ISISlave communication takes place over the ISI. Immediately after reset this can only be
by means of broadcast messages. Once the bootloader1 program has successfully executed on all SoPECs
in a multi-SoPEC system the ISIMaster can communicate with each SoPEC [...] interpret the message to
determine if the message contains information required to be sent to the host. In the case of the
ISIMaster [...] FIFO for transmission to the host.

The above mechanisms are appropriate for the types of communication outlined in sections 12.4.2.3 and
12.4.[...].



12.7.5 ISISlave - ISISlave communication

ISISlave to ISISlave communication is expected to be limited to two special cases: (a) when the
PrintMaster [...] SoPEC is used. When the PrintMaster is [...] messages (and receive responses to
these messages) to other ISISlaves [...] in the system. All ISISlave to
ISISlave communication will take place in response to ping messages from the ISIMaster.

12.7.6 SCB Map configuration registers

[...] mapping [...] performed on an endpoint [...] configuration register to allow its data sink be
selected. Mapping an endpoint onto a [...] sink needs to be enabled by writing to
the appropriate configuration registers in the USB controller / ISI / DMA manager.

Table 36. SCB Map configuration registers

Address  Register name  #bits  Reset  Description
0x100    USBEP0Dest     7      0x20   This register determines which of the data sinks the
                                      data arriving in EP0 should be routed to
0x104    USBEP1Dest     7      0x21   Data sink for USB EP1
0x108    USBEP2Dest     7      0x3E   Data sink for USB EP2

[...] programmed with 0x20 and 0x21 (for ISI0.0 and ISI0.1) respectively to ensure data arriving on
these endpoints is moved directly [...]



Table 37. USBEPnDest register

Field name     Bit(s)  Description
DestISISubId   0       Indicates which DMAChannel of the target SoPEC the endpoint [...]:
                       0 = DMAChannel0
                       1 = DMAChannel1
DestISIId      4:1     Denotes the ISIId of the target SoPEC as per Table 35
ChannelEn      5       Enable bit for the DMAChannel:
                       0 = Channel disabled
                       1 = Channel enabled
SequenceBit    6       Sequence bit for packets going from USBEPn to DestISIId.DestISISubId.
                       Every CPU write to this register initialises the value of the
                       sequence bit and this is subsequently updated by the ISI after
                       every successful long packet transmission.
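
As a cross-check of the field layout in Table 37, the sketch below packs and unpacks a USBEPnDest
value. The function names are hypothetical helpers, not part of any SoPEC software; only the bit
positions come from the table.

```python
def encode_usbepn_dest(isi_id, isi_sub_id, channel_en=True, sequence_bit=0):
    """Pack the Table 37 fields into a 7-bit USBEPnDest register value."""
    if not 0 <= isi_id <= 15:
        raise ValueError("DestISIId is a 4-bit field")
    if isi_sub_id not in (0, 1) or sequence_bit not in (0, 1):
        raise ValueError("DestISISubId and SequenceBit are single bits")
    return (sequence_bit << 6) | (int(channel_en) << 5) | (isi_id << 1) | isi_sub_id

def decode_usbepn_dest(value):
    """Split a USBEPnDest register value back into its named fields."""
    return {
        "DestISISubId": value & 0x1,
        "DestISIId": (value >> 1) & 0xF,
        "ChannelEn": (value >> 5) & 0x1,
        "SequenceBit": (value >> 6) & 0x1,
    }
```

With this layout the reset values listed in Table 36 decode to enabled channels: 0x20 is ISI0.0,
0x21 is ISI0.1 and 0x3E is the broadcast channel ISI15.0.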



[...] should [...] as many USB endpoints, under the control of the host, as are required for
the multi-SoPEC system it is part of. As already mentioned this mapping may be dynamically reconfigured.



12.7.7 SCB transmit buffer arbitration 

[...] been emptied the SCB control logic will immediately seek to refill it.

As there may be data waiting in the USB endpoint FIFOs and in the CPUISI transmit buffer [...]
the ChannelEn bit of the USBEPnDest register) with data already in their associated endpoint FIFOs or
short packets being sent on the USB. Care should be taken to ensure that the USB bandwidth is efficiently
utilised at all times.

12.7.8 SCB Control Block 

The SCB control block is responsible for coordinating access to and between the various sub-blocks in the 
SCB. This includes translating between the CPU subsystem bus and the USB native bus protocol, moving 
data from the USB endpoint FIFOs into the SCB transmit buffer, moving data from the CPUISI transmit
buffer into the SCB transmit buffer and arbitrating between the CPU and itself for access to the SCB sub- 
blocks. 



Table 38. SCB control block configuration registers

Address  Register name  #bits  Reset  Description
0x120    WakeupEnable   2      0x0    This register is used to gate the propagation of the
                                      USB and ISI reset signals to the CPR block. Active
                                      high.
                                      WakeUpEnable[0]: usb_cpr_reset_n control
                                      WakeUpEnable[1]: isi_cpr_reset_n control
0x124    SCBTxBuffArb   2      0x0    Determines which source has priority when contention
                                      arises in filling the SCBTxBuffer. When a bit is
                                      set priority is given to the relevant source.
                                      SCBTxBuffArb[0]: CPU priority
                                      SCBTxBuffArb[1]: USB priority
0x128    SCBDebugSel    10     0x000  Contains address of the register selected for debug
                                      observation as it would appear on cpu_adr[11:2].
                                      The contents of the selected register are output in
                                      the scb_cpu_data bus while cpu_scb_sel is low and
                                      scb_cpu_debug_valid is asserted to indicate the
                                      debug data is valid.
                                      It is expected that a number of pseudo-registers will
                                      be made available for debug observation and these
                                      will be outlined with the implementation details.



12.8 DMA Manager 

The DMA manager manages the flow of data between the SCB and the embedded DRAM. Whilst the 
CPU could be used for the movement of data in a USB1.1 enabled SoPEC a DMA manager is a more
efficient solution as it will handle data in a more predictable fashion with less latency and requiring less
buffering. Furthermore a DMA manager is required to support the ISI transfer speed and to ensure that the
SoPEC could be used with a high speed ISI-Bridge chip in the future.

The DMA manager uses two independent channels, one for each ISISubId, to control the movement of 
data. Both DMAChannels only support write operation and can transfer data from any USB endpoint and 
from the ISI receive buffer. Data is moved at the soonest opportunity to do so and is always moved in 256- 
bit slices as required by the DIU. When it is not possible to use a 256-bit slice of data (e.g. at the end of a 
packet or for a short packet) the DMA manager will still use 256-bit access to the DIU. This means that for 
a DIU write (data incoming to the SoPEC) the DMA manager will pad the valid data with zeroes until a 
256-bit slice has been filled. 
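
The padding behaviour can be sketched in a few lines. This is an illustration only: the function name
is hypothetical, and placing the zero padding after the valid bytes is an assumption consistent with
"pad the valid data with zeroes until a 256-bit slice has been filled".

```python
SLICE_BYTES = 32  # one 256-bit DIU access

def to_diu_slices(data):
    """Split incoming bytes into 256-bit slices, zero-padding the final partial slice."""
    slices = []
    for start in range(0, len(data), SLICE_BYTES):
        chunk = data[start:start + SLICE_BYTES]
        # bytes(n) yields n zero bytes, so a full chunk gets no padding at all
        slices.append(chunk + bytes(SLICE_BYTES - len(chunk)))
    return slices
```

A 40-byte packet, for example, becomes two DIU writes: one full slice and one slice holding 8 valid
bytes followed by 24 zero bytes.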

The DMA manager handles all issues relating to byte/word/longword address alignment, data endianness
and transaction scheduling. It arbitrates between data arriving from the ISI and data arriving from a USB
endpoint on a round robin basis. The greater guaranteed bandwidth available to the DMA manager (50
Mbit/s at the time of writing but this may need to be increased, especially if a 4-wire ISI bus is used. See
section 20.6 for more details) ensures that the DMA manager is non-blocking.

While the DMA manager performs the work of moving data the CPU controls the destination and relative 
timing of dataflows to and from the DRAM. The management of the DRAM data buffers requires the CPU 
to have accurate and timely visibility of both the DMA and PEP memory usage. In other words when the 
PEP has completed processing of a page band the CPU needs to be aware of the fact that an area of mem- 
ory has been freed up to receive incoming data. The management of these buffers may also be performed 
by the host. 

12.8.1 Circular buffer operation 

The DMA manager supports the use of circular buffers for both DMAChannels. Each circular buffer is
controlled by 5 registers: DMAnBottomAdr, DMAnTopAdr, DMAnMaxAdr, DMAnCurrWPtr and
DMAnIntAdr. The operation of the circular buffers is shown in Figure 40 below.




[Figure 40 shows two snapshots, (a) and (b), of a circular buffer bounded by DMAnBottomAdr and
DMAnTopAdr, with the positions of DMAnCurrWPtr, DMAnIntAdr and DMAnMaxAdr marked.
Key: free buffer space; filled buffer space (unprocessed data); buffer space filled since last write
to the DMAnIntAdr/DMAnMaxAdr registers.]

Figure 40. Circular buffer operation

Here we see two snapshots of the status of a circular buffer with (b) occurring sometime after (a) and some
CPU writes occurring in between (a) and (b). These CPU writes are most likely to be as a result of a
finished band interrupt (which frees up buffer space) but could also have occurred in a DMA interrupt service
routine resulting from DMAnIntAdr being hit. The DMA manager will continue filling the free buffer
space depicted in (a), advancing the DMAnCurrWPtr after each write to the DIU. Note that the
DMAnCurrWPtr register always points to the next address the DMA manager will write to. When the DMA manager
reaches the address in DMAnIntAdr (i.e. DMAnCurrWPtr = DMAnIntAdr) it will generate an interrupt if the
DMAnIntAdrMask bit in the DMAMask register is set. The purpose of the DMAnIntAdr register is to alert
the CPU that data (such as a control message or a page or band header) has arrived that it needs to process.
The interrupt routine servicing the DMA interrupt will change the DMAnIntAdr value to the next location
that data of interest to the CPU will have arrived by.

In the scenario shown in Figure 40 the CPU has determined (most likely as a result of a finished band
interrupt) that the filled buffer space in (a) has been freed up and is therefore available to receive more
data. The CPU therefore moves the DMAnMaxAdr to the end of the section that has been freed up and
moves the DMAnIntAdr address to an appropriate offset from the DMAnMaxAdr address. The DMA
manager continues to fill the free buffer space and when it reaches the address in DMAnTopAdr it wraps around
to the address in DMAnBottomAdr and continues from there. DMA transfers will continue indefinitely in
this fashion until the DMA manager reaches the address in the DMAnMaxAdr register.

The circular buffer is initialised by writing the top and bottom addresses to the DMAnTopAdr and
DMAnBottomAdr registers, writing the start address (which does not have to be the same as the DMAnBottomAdr
even though it usually will be) to the DMAnCurrWPtr register and appropriate addresses to the
DMAnIntAdr and DMAnMaxAdr registers. The DMA operation will not commence until a 1 has been written to the
relevant bit of the DMAChanEn register.

While it is possible to modify the DMAnTopAdr and DMAnBottomAdr registers after the DMA has started
it should be done with caution. The DMAnCurrWPtr register should not be written to while the
DMAChannel is in operation. DMA operation may be stalled at any time by clearing the appropriate bit of
the DMAChanEn register or by disabling an SCB mapping or ISI receive operation.
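
The pointer behaviour described in this section can be modelled with a short sketch. It is a
simplified software model, not the hardware implementation: the class and event names are
hypothetical, addresses are counted in 256-bit words, and the rule that the location in DMAnMaxAdr is
never written is taken from the register description in section 12.8.3.

```python
class CircularBuffer:
    """Toy model of one DMAChannel's circular buffer registers."""

    def __init__(self, bottom, top, curr_wptr, int_adr, max_adr):
        self.bottom, self.top = bottom, top        # inclusive bounds
        self.curr_wptr = curr_wptr                 # next address to be written
        self.int_adr, self.max_adr = int_adr, max_adr
        self.enabled = False                       # models the DMAChanEn bit
        self.written = []                          # addresses written so far

    def write_slice(self):
        """Write one slice; return the set of status events, or None if stalled."""
        if not self.enabled or self.curr_wptr == self.max_adr:
            return None
        self.written.append(self.curr_wptr)
        # Wrap to the bottom address after writing the (inclusive) top address.
        self.curr_wptr = self.bottom if self.curr_wptr == self.top else self.curr_wptr + 1
        events = set()
        if self.curr_wptr == self.int_adr:
            events.add("int_adr_hit")
        if self.curr_wptr == self.max_adr:
            events.add("max_adr_hit")
        return events
```

Starting at address 5 in a buffer spanning 0 to 7, with the interrupt address at 1 and the maximum
address at 3, the model writes 5, 6, 7, 0, 1 and 2 and then stalls with the write pointer at 3.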



12.8.2 DMA manager DRAM bandwidth requirements

The DIU must guarantee the SCB enough bandwidth to ensure that neither a USB endpoint FIFO nor the
ISI receive buffer can overrun. For example, to facilitate bursty 32 Mbit/s transfers a SoPEC with a 64-
byte ISI receive buffer would need to be able to transfer 256 bits every 1280 cycles (@160 MHz). This is
in addition to the USB transactions targeted at the ISIMaster SoPEC which may be in the region of 8-9
Mbit/s. While USB has a backpressure mechanism SoPEC should strive to obtain optimum USB
bandwidth utilization and so USB backpressuring should only be used as a last resort. The DIU currently
guarantees 50 Mbit/s to the SCB and more bandwidth will be available when other DIU requestors do not take
their slots. This is sufficient for the SCB's requirements.

12.8.3 DMA manager configuration registers

All of the circular buffer registers are 256-bit word aligned as required by the DIU. The DMAnBottomAdr
and DMAnTopAdr registers are inclusive, i.e. the addresses contained in those registers form part of the
circular buffer. The DMAnCurrWPtr always points to the next location the DMA manager will write to so
interrupts are generated whenever the DMA manager reaches the address in either the DMAnIntAdr or
DMAnMaxAdr registers rather than when it actually writes to these locations. It therefore cannot write to
the location in the DMAnMaxAdr register.



Table 39. DMA Manager Configuration Registers

Address  Register name  #bits  Reset     Description
0x200    DMA0BottomAdr  17     0x0_0000  The 256-bit aligned DRAM address of the
                                         bottom of the circular buffer serviced by
                                         DMAChannel0
0x204    DMA0TopAdr     17     0x0_0000  The 256-bit aligned DRAM address of the
                                         top of the circular buffer serviced by
                                         DMAChannel0
0x208    DMA0CurrWPtr   17     0x0_0000  The 256-bit aligned DRAM address of the
                                         next location DMAChannel0 will write to. This
                                         register is set by the CPU at the start of a
                                         DMA operation and dynamically updated by
                                         the DMA manager during the operation.
0x20C    DMA0IntAdr     17     0x0_0000  The 256-bit aligned DRAM address of the
                                         location that will trigger an interrupt when
                                         reached by DMAChannel0 buffer.
0x210    DMA0MaxAdr     17     0x0_0000  The 256-bit aligned DRAM address of the
                                         last free location in the DMAChannel0 circular
                                         buffer. The DMAChannel0 transfers will
                                         stop when it reaches this address.
0x214    DMA0SeqBit     1      0x0       Sequence bit for DMAChannel0. This bit may
                                         be initialised by the CPU but is updated by
                                         the ISI each time an error-free long packet is
                                         received.
0x218    DMA1BottomAdr  17     0x0_0000  The 256-bit aligned DRAM address of the
                                         bottom of the circular buffer serviced by
                                         DMAChannel1
0x21C    DMA1TopAdr     17     0x0_0000  The 256-bit aligned DRAM address of the
                                         top of the circular buffer serviced by
                                         DMAChannel1
0x220    DMA1CurrWPtr   17     0x0_0000  The 256-bit aligned DRAM address of the
                                         next location DMAChannel1 will write to. This
                                         register is set by the CPU at the start of a
                                         DMA operation and dynamically updated by
                                         the DMA manager during the operation.
0x224    DMA1IntAdr     17     0x0_0000  The 256-bit aligned DRAM address of the
                                         location that will trigger an interrupt when
                                         reached by DMAChannel1 buffer.
0x228    DMA1MaxAdr     17     0x0_0000  The 256-bit aligned DRAM address of the
                                         last free location in the DMAChannel1 circular
                                         buffer. The DMAChannel1 transfers will
                                         stop when it reaches this address.
0x22C    DMA1SeqBit     1      0x0       Sequence bit for DMAChannel1. This bit may
                                         be initialised by the CPU but is updated by
                                         the ISI each time an error-free long packet is
                                         received.
0x230    DMAChanEn      2      0x0       Enable DMA operation on a per channel
                                         basis. Active high.
                                         DMAChanEn[0]: Enable DMAChannel0
                                         DMAChanEn[1]: Enable DMAChannel1
0x234    DMAStatus      4      0x0       DMA status register. See section 12.8.3.1.
                                         This register is read-only.
0x238    DMAMask        4      0x0       DMA mask register. See section 12.8.3.2



12.8.3.1 DMAStatus register 

The contents of the DMAStatus register are read-only to the CPU. The status bits are not sticky bits, i.e.
they reflect the 'live' status of the channel. Status bits may only be cleared by writing to the relevant
DMAnIntAdr or DMAnMaxAdr register.

Table 40. DMA Status Register

Field name             Bit(s)  Description
DMAChannel0IntAdrHit   0       DMAChannel0 has reached the address contained in the
                               DMA0IntAdr register
DMAChannel0MaxAdrHit   1       DMAChannel0 has reached the address contained in the
                               DMA0MaxAdr register
DMAChannel1IntAdrHit   2       DMAChannel1 has reached the address contained in the
                               DMA1IntAdr register
DMAChannel1MaxAdrHit   3       DMAChannel1 has reached the address contained in the
                               DMA1MaxAdr register



12.8.3.2 DMAMask register

All bits of the DMAMask are both readable and writable by the CPU. The DMA manager cannot alter the
value of this register. All interrupts are edge sensitive, i.e. the DMA manager will generate a dma_icu_irq
pulse each time a status bit goes high and the corresponding mask bit is enabled.

Table 41. DMA Manager Mask Register

Field name                 Bit(s)  Description
DMAChannel0IntAdrHitMask   0       1 = Generate an interrupt when the DMAChannel0IntAdrHit status
                                   bit goes high
                                   0 = Do not generate an interrupt when the DMAChannel0IntAdrHit
                                   status bit goes high
DMAChannel0MaxAdrHitMask   1       1 = Generate an interrupt when the DMAChannel0MaxAdrHit status
                                   bit goes high
                                   0 = Do not generate an interrupt when the DMAChannel0MaxAdrHit
                                   status bit goes high
DMAChannel1IntAdrHitMask   2       As per DMAChannel0IntAdrHitMask
DMAChannel1MaxAdrHitMask   3       As per DMAChannel0MaxAdrHitMask
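
The edge-sensitive behaviour of the mask can be sketched as a pure function of the previous and
current status values. The function name is hypothetical and the model ignores timing; it only shows
which status bits would cause an interrupt pulse.

```python
STATUS_BITS = 0xF  # DMAStatus and DMAMask are both 4 bits wide

def irq_sources(prev_status, new_status, mask):
    """Return the status bits that rose (0 -> 1) and are enabled in the mask."""
    rising = ~prev_status & new_status & STATUS_BITS
    return rising & mask
```

A status bit that stays high produces no further pulse, and a rising bit whose mask bit is clear is
ignored.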



12.9 SCB Implementation 

This section is still a work in progress - the information here should be ignored as it refers to an earlier
version of the SCB.

[Block diagram (uncaptioned): SCB implementation showing the USB, ISI, DMA Manager, CPU Interface and
Switch sub-blocks, the DRAM (DIU) interface signals (scb_diu_wreq, diu_scb_wack, scb_diu_wvalid,
scb_diu_rreq, diu_scb_rack, diu_scb_rvalid, scb_diu_wadr, scb_diu_radr, scb_diu_data, diu_data) and the
inter-block buses (dma_scbs_data, scbs_dma_data, dma_scbs_cntrl, usb_scbs_data, usb_scbs_cntrl,
isi_scbs_data, scbs_isi_data).]







Characteristics of the data channels: 

USB: Packets should be moved sequentially out of the endpoint FIFOs. The USB is the slowest compo- 
nent in the SCB but its bandwidth is most precious. However both the DMA and ISI can transfer data (50 
and 40 Mbps respectively) much faster than the USB can receive data (12 Mbps peak rate) so no flow con- 
trol problems will occur due to a speed mismatch. If one of the DMA or ISI data sinks becomes blocked or 
inactive then the USB controller will assert backpressure (by NAKing packets) when the double buffer for 
the associated endpoint is filled. Other endpoints will remain active in this scenario and the DMA and ISI 
will still be able to transfer data at their peak rates. The worst case scenario is when all endpoints have 
their double buffers filled (because all the data sinks had been blocked/disabled) and then all data sinks 
become available again. In this case the backlog will be fully cleared in 3 USB 64-byte packet times. 

ISI: The ISI can support simultaneous reception and transmission of packets. ISI packets should be
transferred sequentially in either direction. The ISI is expected to handle the packet header and trailer, if
any is used for error detection, in both directions, i.e. only raw payload data is routed through the SCB map.

DMA: The DMA channels are unidirectional but their direction, namely whether they are transferring
data to or from DRAM, is programmable. Each DMA transaction to DRAM will be 256 bits wide but all
256 bits are not always valid. When a transfer of less than 256 bits is required the DMA manager pads the
remaining bits in the 256-bit word with zeroes, in the case of a write to DRAM, or discards the
unnecessary bits in the case of a DRAM read. Can we get by with single (256 bits each way or maybe even 256
bits in all?) buffering for the DRAM manager?

[Figure 41 signal labels: dma_scbs_data, scbs_dma_data, dma_scbs_cntrl, usb_scbs_data, usb_scbs_cntrl,
isi_scbs_data, scbs_isi_data, isi_scbs_cntrl; dma_dout_rdy_id[1:0], dma_dout, dma_dout_valid,
dma_din_rdy, dma_din_id[1:0], dma_din, dma_din_valid, usb_ep_rdy[2:0], usb_rx_data, usb_data_valid,
isi_data_rdy_id[5:0], isi_rx_data, isi_tx_data; CPU Subsystem Interface.]

Figure 41. SCB Switch block diagram



SCB Switch pseudocode:

    const no_data_sinks = 12

    for i = 1 to no_data_sinks
        if (i <= 2) then
            sink_data is dma_din
            sink_rdy is dma_din_rdy
            sink_data_valid is dma_din_valid
            sink_id is dma_din_id
        else
            sink_data is isi_tx_data
            sink_rdy is isi_tx_rdy
            sink_data_valid is isi_tx_data_valid
            sink_id is isi_tx_data_id

        // Each data sink has an associated data source register.
        // A non-zero value means the sink is enabled
        if (data_src_reg[i] != 0) then
            if ((data_src_reg[i] & 0xF0) == 0x10) then // A USB endpoint is the data source
                if ((usb_ep_rdy[4] == 1) AND (usb_ep_rdy[3:0] == data_src_reg[i][3:0])) then
                    // there is data waiting in the EP FIFO
                    while ((usb_data_valid == 1) AND (sink_rdy == 1) AND clocktick)
                        sink_data = usb_rx_data
                        sink_data_valid = 1
                        if (i <= 2) then // The sink is a DMA channel
                            sink_id[1] = 1
                            sink_id[0] = i - 1
                        else // The sink is an ISI channel
                            sink_id[5] = 1
                            sink_id[4:0] = i - 3
                else // There is no data ready to go
                    sink_data_valid = 0

            elsif ((data_src_reg[i] & 0xF0) == 0x20) then // The ISI is the data source
                if (isi_data_rdy_id[3:0] == data_src_reg[i][3:0]) then
                    // there is data waiting in the ISI receive FIFO for this ISISubId
                    while ((isi_rx_data_valid == 1) AND (sink_rdy == 1) AND clocktick)
                        sink_data = isi_rx_data
                        sink_data_valid = 1
                        if (i <= 2) then // The sink is a DMA channel
                            sink_id[1] = 1
                            sink_id[0] = i - 1
                        else // The sink is an ISI channel
                            sink_id[5] = 1
                            sink_id[4:0] = i - 3
                else // There is no data ready to go
                    sink_data_valid = 0

            elsif ((data_src_reg[i] & 0xF0) == 0x30) then // The DMA is the data source
                if (dma_dout_rdy_id[0] == data_src_reg[i][0]) then
                    // there is data waiting in the relevant DMA buffer for this sink
                    while ((dma_dout_valid == 1) AND (sink_rdy == 1) AND clocktick)
                        sink_data = dma_dout
                        sink_data_valid = 1
                        if (i <= 2) then // The sink is a DMA channel
                            sink_id[1] = 1
                            sink_id[0] = i - 1
                        else // The sink is an ISI channel
                            sink_id[5] = 1
                            sink_id[4:0] = i - 3
                else // There is no data ready to go
                    sink_data_valid = 0

The above pseudocode has a few shortcomings, particularly if all our data buses are not the same size, but
it shows the basic functionality the switch is supposed to offer. The main loop of the pseudocode (for i = 1
to no_data_sinks) dictates what happens within one timeslot. The timeslots take as long as required to
complete and loop around endlessly. The msb of the usb_ep_rdy[4:0], isi_data_rdy_id[5:0] and
dma_dout_rdy_id[1:0] signals is used to indicate that data is available in the relevant block.
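
The register encoding used by the switch pseudocode can be illustrated with a short software model. This is only a sketch under stated assumptions: the function and type names are hypothetical, the lower nibble is treated as a generic sub-id for all three source types (the pseudocode compares only bit 0 for DMA sources), and ISI sinks are numbered from i - 3 as in the ISI and DMA branches.

```python
from typing import NamedTuple, Optional

# Source-type codes from the switch pseudocode (upper nibble of the register).
SRC_USB, SRC_ISI, SRC_DMA = 0x10, 0x20, 0x30


class DataSource(NamedTuple):
    kind: str    # "usb", "isi" or "dma"
    sub_id: int  # lower nibble: USB endpoint, ISI sub-id or DMA channel


def decode_data_src_reg(value: int) -> Optional[DataSource]:
    """Decode one data source register; None means the sink is disabled."""
    if value == 0:
        return None
    kinds = {SRC_USB: "usb", SRC_ISI: "isi", SRC_DMA: "dma"}
    kind = kinds.get(value & 0xF0)
    if kind is None:
        raise ValueError(f"unknown source type in 0x{value:02X}")
    return DataSource(kind, value & 0x0F)


def sink_id(i: int) -> int:
    """Build the sink_id value driven for sink i (1-based), msb set to flag data."""
    if not 1 <= i <= 12:
        raise ValueError("sink index must be in 1..12")
    if i <= 2:
        return (1 << 1) | (i - 1)   # DMA channel: 2-bit id
    return (1 << 5) | (i - 3)       # ISI channel: 6-bit id
```

For instance, a register value of 0x12 would select USB endpoint 2 as the source for that sink, and a value of zero leaves the sink disabled.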
13 General Purpose IO (GPIO)



13.1 Overview 



The General Purpose IO block (GPIO) is responsible for control and interfacing of GPIO pins to the rest of
the SoPEC system. It provides easily programmable control logic to simplify control of GPIO functions.
In all there are 14 GPIO pins, of which certain pins have special functions. Their functions are as follows:

• 4 Motor control IOs, internally pulled down

• 4 General purpose high drive pulsed IOs capable of driving LEDs

• 4 Open drain IOs used for LSS interfaces

• 2 Normal drive IOs used for the ISI interface in Multi-SoPEC mode

Each of the pins can be configured in either input or output mode, and each pin is independently controlled. A
programmable de-glitching circuit exists for all input pins. Each input is a Schmitt trigger to increase noise
immunity should the input be used without the de-glitch circuit. The mapping of the above functions, and
their alternate use in a slave SoPEC, to GPIO pins is shown in Table 42 below.

Table 42. GPIO pin functionality

GPIO pin(s)  | Function
-------------|----------------------------------------
gpio[3:0]    | Motor control pins / general purpose IO
gpio[7:4]    | LED driver pins / general purpose IO
gpio[11:8]   | LSS interface pins / general purpose IO
gpio[13:12]  | ISI interface pins / general purpose IO



13.2 Motor control 



The motor control pins can be directly controlled by the CPU or the motor control logic can be used to 
generate the phase pulses for the stepper motors. The controller consists of two central counters from 
which the control pins are derived. The central counters have several registers (see Table 44) used to con- 
figure the cycle period, the phase, the duty cycle, and counter granularity. 

There are two motor master counters (0 and 1) with identical features. The period of the master counters
is defined by the MotorMasterClkPeriod[1:0] and MotorMasterClkSrc registers, i.e. both master counters
are derived from the same MotorMasterClkSrc. The MotorMasterClkSrc defines the timing pulses used by
the master counters to determine the timing period. The MotorMasterClkSrc can select clock sources of
1µs, 100µs, 10ms and pclk timing pulses.

The MotorMasterClkPeriod[1:0] registers are set to the number of timing pulses required before the
timing period re-starts. Each master counter is set to the relevant MotorMasterClkPeriod value and counts
down a unit each time a timing pulse is received.

The master counters reset to the MotorMasterClkPeriod value and count down. Once the value hits zero a new
value is reloaded from the MotorMasterClkPeriod[1:0] registers. This ensures that no master clock glitch
is generated when changing the clock period.

Each of the IO pins for the motor controller is derived from the master counters. Each pin has independent
configuration registers. The MotorMasterClkSelect[3:0] registers define which of the two master
counters to use as the source for each motor control pin. The master counter value is compared with the
configured MotorCtrlHigh and MotorCtrlLow registers. If the count is equal to the MotorCtrlHigh value the
motor control pin is set to 1; if the count is equal to the MotorCtrlLow value the motor control pin is set to 0.
This allows the phase and duty cycle of the motor control pins to be varied at pclk granularity.

The motor control generators can be paused at the end of a clock period by setting the MotorMasterClockEnable
register to zero. This allows the CPU to re-configure the motor controller without causing a glitch
on the output pins.



13.3 LED CONTROL



LED lifetime and brightness can be improved and power consumption reduced by driving the LEDs with a
pulsed rather than a DC signal. The source clock for each of the LED pins is a 7.8kHz (128µs period)
clock generated from the 1µs clock pulse from the Timers block. The LEDDutySelect registers are used to
create a signal with the desired waveform. Unpulsed operation of the LED pins can be achieved by using
CPU IO direct control. By default the LED pins are controlled by the LED control logic.



Figure 42. Duty Cycle Select (LED output waveforms relative to the master clock for LEDDutySelect = 0 to 7)



13.4 LSS INTERFACE VIA GPIO 

In some SoPEC system configurations one or more of the LSS interfaces may not be used. Unused LSS
interface pins can be reused as general IO pins by configuring the CpuIOCtrl register. When a bit in the
CpuIOCtrl is set the corresponding pin is controlled by the CPU registers, otherwise the pin is controlled
by the LSS block. By default the LSS controls the GPIO pins 11 to 8.

13.5 ISI INTERFACE VIA GPIO

In Multi-SoPEC mode the SCB block (in particular the ISI sub-block) requires direct access to and from
the gpio[12] and gpio[13] pins. Control of the ISI interface pins is determined by the CpuIOCtrl register.
When a bit in the CpuIOCtrl is set the corresponding pin is controlled by the CPU registers, otherwise the
pin is controlled by the ISI block directly. By default the pins are directly controlled by the ISI block.
In single SoPEC systems the pins can be re-used by the GPIO.

13.6 CPU GPIO CONTROL 

The CPU can assume direct control of any (or all) of the IO pins individually. On a per pin basis the CPU
can turn on direct access to the pin by setting the CpuIOCtrl register. Once set the IO pin assumes the
direction specified by the CpuIODirection register. When in output mode the value in register CpuIOOut
will be directly reflected to the output driver. When in input mode the status of the input pin can be read in
either the direct version or a de-glitched form, by reading CpuIOIn and CpuIOInDeglitch respectively.
When writing to the CpuIOOut register the top bits of the register (bits 29 to 16) are used to filter access to
the lower bits (13 to 0).

13.7 Programmable de-glitching logic

Each IO pin can be filtered through a de-glitching logic circuit. The circuit can be configured to sample the
IO pin for a predetermined time before concluding that a pin is in a particular state. The exact sampling
time is configurable, but each GPIO pin must use one of two possible configured values (selected by
DeGlitchSelect). The sampling length is the same for both high and low states. The DeGlitchCount is
programmed to the number of system time units that a state must be valid for before the state is passed on.
The time units are selected by DeGlitchClkSrc and can be one of 1µs, 100µs, 10ms and pclk pulses.
For example, if DeGlitchCount is set to 10 and DeGlitchClkSrc is set to 3, then an input pin (one of
gpio[13:0]) must consistently retain its value for 10 system clock cycles (pclk) before the input state will be
propagated from CpuIOIn to CpuIOInDeglitch.
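
The filtering behaviour can be sketched with a small behavioural model for a single pin. This is not the RTL: the function name is hypothetical, one raw sample is assumed per selected time unit, and the output is assumed to start at the level of the first sample.

```python
def deglitch(samples, count):
    """Return the deglitched output for a sequence of raw 0/1 samples.

    A new input level is passed on only after it has stayed unchanged
    for `count` further time units (the DeGlitchCount value).
    """
    out = []
    state = samples[0] if samples else 0  # assumed initial output level
    prev = state
    remaining = 0
    for s in samples:
        if s != prev:
            remaining = count   # input changed: restart the stability count
        elif remaining > 0:
            remaining -= 1      # input stable: count down one time unit
        if remaining == 0:
            state = s           # stable for long enough: pass the level on
        prev = s
        out.append(state)
    return out
```

With a count of 3, a single-sample pulse is rejected while a level that persists is passed on three time units after the change.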

13.8 Interrupt generation

Any of the GPIO pins can generate an interrupt from the raw or deglitched version of the input pin. There
are 14 possible interrupt sources from the GPIO to the interrupt controller, one interrupt per input pin. The
InterruptSrcSelect register determines whether the raw input or the deglitched version is used as the
interrupt source.

The interrupt type, masking and priority can be programmed in the interrupt controller.

13.9 Frequency analyser

The frequency analyser measures the duration between successive positive edges on an input pin and
reports the last period measured (FreqAnaLastPeriod) and a running average period (FreqAnaAverage).
The running average is updated each time a new positive edge is detected and is calculated by
FreqAnaAverage = (FreqAnaAverage / 8) * 7 + FreqAnaLastPeriod / 8.

The analyser can be used with any input pin (or its deglitched form), but only one pin at a time can be
selected. The pin is selected by FreqAnaPinSelect and its deglitched form can be selected by
FreqAnaPinFormSelect.
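
The running average formula can be tried out in a few lines. This is a sketch assuming integer (truncating) division, as hardware would typically use; the function name is hypothetical.

```python
def update_average(average: int, last_period: int) -> int:
    """New running average: 7/8 of the old average plus 1/8 of the new period."""
    return (average // 8) * 7 + last_period // 8


# Feed a constant period and watch the average approach it.
avg = 0
for _ in range(100):
    avg = update_average(avg, 1600)
```

Because of the truncating divisions, the average fed with a constant period can settle a little below that period rather than exactly on it.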
13.10 Implementation

13.10.1 Definitions of I/O



Table 43. I/O definition

Port name | Pins | I/O | Description
----------|------|-----|------------
Clocks and Resets
pclk | 1 | In | System Clock
prst_n | 1 | In | System reset, synchronous active low
tim_pulse[2:0] | 3 | In | Timers block generated timing pulses. 0 - 1 µs pulse; 1 - 100 µs pulse; 2 - 10 ms pulse
CPU Interface
cpu_addr[7:2] | 6 | In | CPU address bus. Only 6 bits are required to decode the address space for this block
cpu_dataout[31:0] | 32 | In | Shared write data bus from the CPU
gpio_cpu_data[31:0] | 32 | Out | Read data bus to the CPU
cpu_rwn | 1 | In | Common read/not-write signal from the CPU
cpu_gpio_sel | 1 | In | Block select from the CPU. When cpu_gpio_sel is high both cpu_addr and cpu_dataout are valid
gpio_cpu_rdy | 1 | Out | Ready signal to the CPU. When gpio_cpu_rdy is high it indicates the last cycle of the access. For a write cycle this means cpu_dataout has been registered by the GPIO block and for a read cycle this means the data on gpio_cpu_data is valid.
gpio_cpu_berr | 1 | Out | Bus error signal to the CPU indicating an invalid access.
gpio_cpu_debug_valid | 1 | Out | Debug data valid on gpio_cpu_data bus. Active high
cpu_acode[1:0] | 2 | In | CPU Access Code signals. These decode as follows: 00 - User program access; 01 - User data access; 10 - Supervisor program access; 11 - Supervisor data access
IO Pins
gpio_o[13:0] | 14 | Out | General purpose IO output to IO driver
gpio_i[13:0] | 14 | In | General purpose IO input from IO receiver
gpio_e[13:0] | 14 | Out | General purpose IO output control. Active high driving
GPIO to LSS
lss_gpio_do[1:0] | 2 | In | LSS bus data output. Bit 0 - LSS bus 0; Bit 1 - LSS bus 1
gpio_lss_di[1:0] | 2 | Out | LSS bus data input. Bit 0 - LSS bus 0; Bit 1 - LSS bus 1
lss_gpio_e[1:0] | 2 | In | LSS bus data output enable, active high. Bit 0 - LSS bus 0; Bit 1 - LSS bus 1
lss_gpio_clk[1:0] | 2 | In | LSS bus clock output. Bit 0 - LSS bus 0; Bit 1 - LSS bus 1
GPIO to ISI
gpio_isi_din[1:0] | 2 | Out | Input data from IO receivers to ISI
isi_gpio_dout[1:0] | 2 | In | Data output from ISI to IO drivers
isi_gpio_e[1:0] | 2 | In | GPIO ISI pins output enable (active high) from ISI interface
Interrupts
gpio_icu_irq[13:0] | 14 | Out | GPIO pin interrupts
Debug
debug_data_out[16:3] | 14 | In | Output debug data to be muxed on to the GPIO pins
debug_cntrl[16:3] | 14 | In | Control signal for each GPIO bound debug data line indicating whether or not the debug data should be selected by the mux



13.10.2 Configuration registers

The configuration registers in the GPIO are programmed via the CPU interface. Refer to section 11.4.3 on
page 70 for a description of the protocol and timing diagrams for reading and writing registers in the
GPIO. Note that since addresses in SoPEC are byte aligned and the CPU only supports 32-bit register
reads and writes, the lower 2 bits of the CPU address bus are not required to decode the address space for
the GPIO. When reading a register that is less than 32 bits wide, zeros should be returned on the upper
unused bit(s) of gpio_cpu_data. Table 44 lists the configuration registers in the GPIO block.

Table 44. GPIO Register Definition

Address | Register | #bits | Reset | Description
--------|----------|-------|-------|------------
CPU IO Control
0x00 | CpuIOCtrl | 14 | 0x0000 | Indicates whether each IO pin is directly controlled by the CPU or not. 0 - Default Control; 1 - CPU Control
0x04 | CpuIOUserModeMask | 14 | 0x0000 | User Mode Access Mask to CPU GPIO control register. When 1 user access is enabled. One bit per gpio pin. Enables access to CpuIODirection, CpuIOOut, CpuIOIn and CpuIOInDeglitch in user mode if CpuIOCtrl allows CPU access.
0x08 | CpuIOSuperModeMask | 14 | 0x3FFF | Supervisor Mode Access Mask to CPU GPIO control register. When 1 supervisor access is enabled. One bit per gpio pin. Enables access to CpuIODirection, CpuIOOut, CpuIOIn and CpuIOInDeglitch in supervisor mode if CpuIOCtrl allows CPU access.
0x0C | CpuIODirection | 14 | 0x0000 | Indicates the direction of each IO pin, when controlled by the CPU. 0 - Indicates Input Mode; 1 - Indicates Output Mode
0x10 | CpuIOOut | 30 | 0x0000_0000 | Value used to drive output pin in CPU direct mode. bits 13:0 - Value to drive on output GPIO pins; bits 15:14 - Reserved (read as zero always); bits 29:16 - Write enable mask for bits 13:0, 0 enables the write, 1 masks the write (read as zero always)
0x14 | CpuIOIn | 14 | External pin value | Value received on each input pin regardless of mode. Read Only register.
0x18 | CpuIOInDeglitch | 14 | 0x0000 | Deglitched version of CpuIOIn register. Note that after reset this register will reflect the external pin values 256 pclk cycles after they have stabilized. Read Only register.
Deglitch control
0x20-0x24 | DeGlitchCount[1:0] | 2x8 | 0xFF | De-glitch circuit sample count in DeGlitchClkSrc selected units for pins gpio[13:0]
0x28-0x2C | DeGlitchClkSrc[1:0] | 2x2 | 0x3 | Specifies the unit used by the GPIO deglitch circuits: 0 - 1 µs pulse; 1 - 100 µs pulse; 2 - 10 ms pulse; 3 - pclk
0x30 | DeGlitchSelect | 14 | 0x000 | Specifies which deglitch count (DeGlitchCount) and unit select (DeGlitchClkSrc) should be used to deglitch each GPIO pin. 0 - Specifies DeGlitchCount[0] and DeGlitchClkSrc[0]; 1 - Specifies DeGlitchCount[1] and DeGlitchClkSrc[1]
Motor control
0x34 | MotorCtrlUserModeEnable | 1 | 0x0 | User Mode Access enable to motor control configuration registers. When 1 user access is enabled. Enables user access to MotorMasterClkPeriod, MotorMasterClkSrc, MotorDutySelect, MotorPhaseSelect, MotorMasterClockEnable and MotorMasterClkSelect registers
0x38 to 0x3C | MotorMasterClkPeriod[1:0] | 2x16 | 0x0000 | Specifies the motor controller master clock periods in MotorMasterClkSrc selected units
0x40 | MotorMasterClkSrc | 2 | 0x0 | Specifies the unit used by the motor controller master clock generator: 0 - 1 µs pulse; 1 - 100 µs pulse; 2 - 10 ms pulse; 3 - pclk
0x44 to 0x50 | MotorCtrlHigh[3:0] | 4x16 | 0x0000 | Specifies the low to high transition point in the clock period for each motor control pin.
0x54 to 0x60 | MotorCtrlLow[3:0] | 4x16 | 0xFFFF | Specifies the high to low transition point in the clock period for each motor control pin.
0x64 to 0x70 | MotorMasterClkSelect[3:0] | 4x1 | 0x0 | Specifies which motor master clock should be used as a pin generator source. 0 - Clock derived from MotorMasterClkPeriod[0]; 1 - Clock derived from MotorMasterClkPeriod[1]
0x74 | MotorMasterClockEnable | 2 | 0x0 | Enable the motor master clock counter. When 1 count is enabled. Bit 0 - Enable motor master clock 0; Bit 1 - Enable motor master clock 1
LED control
0x78 | LEDCtrlUserModeEnable | 4 | 0x0 | User Mode Access enable to LED control configuration registers. When 1 user access is enabled. One bit per LEDDutySelect select register.
0x7C to 0x88 | LEDDutySelect[3:0] | 4x3 | 0x0 | Specifies the duty cycle for each LED pin. See Figure 42 for encoding details. The LEDDutySelect[3:0] registers determine the duty cycle of the gpio[7:4] pins
Frequency Analyser
0x8C | FreqAnaPinSelect | 4 | 0x00 | Selects which GPIO input should be used for the frequency analysis.
0x90 | FreqAnaPinFormSelect | 1 | 0x0 | Selects if the frequency analyser should use the raw input or the deglitched form. 0 - Deglitched form of input pin; 1 - Raw form of input pin
0x94 | FreqAnaLastPeriod | 16 | 0x0000 | Frequency Analyser last period of selected input pin.
0x98 | FreqAnaAverage | 16 | 0x0000 | Frequency Analyser average period of selected input pin.
0x9C | FreqAnaCountInc | 20 | 0x00000 | Frequency Analyser counter increment amount. For each clock cycle no edge is detected on the selected input pin the accumulator is incremented by this amount.
Miscellaneous
0xA0 | InterruptSrcSelect | 14 | 0x000 | Interrupt source select. 1 bit per GPIO pin. Determines whether the interrupt source is direct from the input pin or the deglitched version. 1 - Input pin direct; 0 - Deglitched input pin
0xA4 | DebugSelect | 6 | 0x00 | Debug address select. Indicates the address of the register to report on the gpio_cpu_data bus when it is not otherwise being used.
0xA8-0xAC | MotorMasterCount | 2x16 | 0x0000 | Motor master clock counter values. Bus 0 - Master clock count 0; Bus 1 - Master clock count 1. Read Only registers
13.10.2.1 Supervisor and user mode access

The configuration registers block examines the CPU access type (cpu_acode signal) and determines if the
access is allowed to that particular register, based on configured user access registers. If an access is not
allowed the GPIO will issue a bus error by asserting the gpio_cpu_berr signal.

Access to the CpuIODirection, CpuIOOut, CpuIOIn and CpuIOInDeglitch registers is filtered by the
CpuIOUserModeMask and CpuIOSuperModeMask registers. Each bit masks access to the corresponding bits in the
CpuIO* registers for each mode, with CpuIOUserModeMask filtering user data mode access and
CpuIOSuperModeMask filtering supervisor data mode access.

The addition of the CpuIOSuperModeMask register helps prevent potential conflicts between user and
supervisor code read modify write operations. For example a conflict could exist if the user code is
interrupted during a read modify write operation by a supervisor ISR which also modifies the CpuIO* registers.
An attempt to write to a disabled bit in user or supervisor mode will be ignored, and an attempt to read a
disabled bit returns zero. If there are no user mode enabled bits then access is not allowed in user mode
and a bus error will result. Similarly for supervisor mode.

When writing to the CpuIOOut register, bits 29 to 16 are used to mask the write to CpuIOOut[13:0]. If
the mask bit is zero the write is active to the corresponding CpuIOOut pin, otherwise the write to that pin is
ignored.

The pseudocode for determining access to the CpuIOOut register is shown below. Similar code could
be shown for the CpuIODirection, CpuIOIn and CpuIOInDeglitch registers.

    if (cpu_acode == SUPERVISOR_DATA_MODE) then
        // supervisor mode
        if (CpuIOSuperModeMask[13:0] == 0) then
            // access is denied, and bus error
            gpio_cpu_berr = 1
        elsif (cpu_rwn == 1) then
            // read mode, filtered by mask
            gpio_cpu_data[13:0] = (CpuIOOut[13:0] & CpuIOSuperModeMask[13:0])
        else
            // write mode, filtered by mask
            mask[13:0] = ~(cpu_dataout[29:16]) & CpuIOSuperModeMask[13:0]
            CpuIOOut[13:0] = ((cpu_dataout[13:0] & mask[13:0]) |
                              (CpuIOOut[13:0] & ~(mask[13:0])))
    elsif (cpu_acode == USER_DATA_MODE) then
        // user data mode
        if (CpuIOUserModeMask[13:0] == 0) then
            // access is denied, and bus error
            gpio_cpu_berr = 1
        elsif (cpu_rwn == 1) then
            // read mode, filtered by mask
            gpio_cpu_data[13:0] = (CpuIOOut[13:0] & CpuIOUserModeMask[13:0])
        else
            // write mode, filtered by mask
            mask[13:0] = ~(cpu_dataout[29:16]) & CpuIOUserModeMask[13:0]
            CpuIOOut[13:0] = ((cpu_dataout[13:0] & mask[13:0]) |
                              (CpuIOOut[13:0] & ~(mask[13:0])))
    else
        // access is denied, bus error
        gpio_cpu_berr = 1
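
The masked write can also be modelled in software, which is handy for checking the value a driver should write to change a single pin. This is a sketch only: the function names are hypothetical, and the bus error is modelled as a Python exception.

```python
PIN_MASK = 0x3FFF  # 14 GPIO pins


def cpu_io_out_write(current: int, write_word: int, mode_mask: int) -> int:
    """Return the new CpuIOOut[13:0] value after a CPU write.

    write_word[13:0]  - data bits
    write_word[29:16] - per-pin write mask, 0 enables the write, 1 masks it
    mode_mask         - CpuIOUserModeMask or CpuIOSuperModeMask for this access
    """
    if mode_mask & PIN_MASK == 0:
        raise PermissionError("bus error: no pins enabled for this mode")
    data = write_word & PIN_MASK
    enable = ~(write_word >> 16) & mode_mask & PIN_MASK
    return (data & enable) | (current & ~enable & PIN_MASK)


def make_write_word(pin: int, value: int) -> int:
    """Build a write word that changes only one pin (hypothetical helper)."""
    mask_others = PIN_MASK & ~(1 << pin)
    return (mask_others << 16) | ((value & 1) << pin)
```

Masking every other pin in the upper half-word lets software update one pin with a single write, without a read modify write sequence.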
Table 45 details the access modes allowed for registers in the GPIO block. In supervisor mode all registers
are accessible. In user mode forbidden accesses will result in a bus error (gpio_cpu_berr asserted).



Table 45. GPIO supervisor and user access modes

Address | Register | Access
--------|----------|-------
0x00 | CpuIOCtrl | Supervisor data mode only
0x04 | CpuIOUserModeMask | Supervisor data mode only
0x08 | CpuIOSuperModeMask | Supervisor data mode only
0x0C | CpuIODirection | CpuIOUserModeMask and CpuIOSuperModeMask filtered
0x10 | CpuIOOut | CpuIOUserModeMask and CpuIOSuperModeMask filtered
0x14 | CpuIOIn | CpuIOUserModeMask and CpuIOSuperModeMask filtered
0x18 | CpuIOInDeglitch | CpuIOUserModeMask and CpuIOSuperModeMask filtered
0x20-0x24 | DeGlitchCount[1:0] | Supervisor data mode only
0x28-0x2C | DeGlitchClkSrc[1:0] | Supervisor data mode only
0x30 | DeGlitchSelect | Supervisor data mode only
0x34 | MotorCtrlUserModeEnable | Supervisor data mode only
0x38 to 0x3C | MotorMasterClkPeriod[1:0] | MotorCtrlUserModeEnable enabled
0x40 | MotorMasterClkSrc | MotorCtrlUserModeEnable enabled
0x44 to 0x50 | MotorCtrlHigh[3:0] | MotorCtrlUserModeEnable enabled
0x54 to 0x60 | MotorCtrlLow[3:0] | MotorCtrlUserModeEnable enabled
0x64 to 0x70 | MotorMasterClkSelect[3:0] | MotorCtrlUserModeEnable enabled
0x74 | MotorMasterClockEnable | MotorCtrlUserModeEnable enabled
0x78 | LEDCtrlUserModeEnable | Supervisor data mode only
0x7C | LEDDutySelect[0] | LEDCtrlUserModeEnable[0] enabled
0x80 | LEDDutySelect[1] | LEDCtrlUserModeEnable[1] enabled
0x84 | LEDDutySelect[2] | LEDCtrlUserModeEnable[2] enabled
0x88 | LEDDutySelect[3] | LEDCtrlUserModeEnable[3] enabled
0x8C | FreqAnaPinSelect | Supervisor data mode only
0x90 | FreqAnaPinFormSelect | Supervisor data mode only
0x94 | FreqAnaLastPeriod | Supervisor data mode only
0x98 | FreqAnaAverage | Supervisor data mode only
0x9C | FreqAnaCountInc | Supervisor data mode only
0xA0 | InterruptSrcSelect | Supervisor data mode only
0xA4 | DebugSelect | Supervisor data mode only
0xA8-0xAC | MotorMasterCount | Supervisor data mode only
13.10.3 GPIO partition

Figure 43. GPIO partition



13.10.4 IO control



The IO control block connects the IO pin drivers to internal signalling based on configured setup registers
and debug control signals.



The motor, LED pins, ISI and LSS control logic:

    // motor and led pins
    for (i=0; i<14; i++) {
        if (debug_cntrl[i] == 1) then
            // debug data is muxed on to the pin
            gpio_e[i] = 1
            gpio_o[i] = debug_data_out[i]
            cpu_io_in[i] = gpio_i[i]
        elsif (cpu_io_ctrl[i] == 1) then
            // CPU direct control
            gpio_e[i] = cpu_io_dir[i]
            gpio_o[i] = cpu_io_out[i]
            cpu_io_in[i] = gpio_i[i]
        else
            // default control
            if (i < 4) then // motor control pins
                gpio_e[i] = 1
                gpio_o[i] = motor_ctrl[i]
                cpu_io_in[i] = gpio_i[i]
            elsif (i < 8) then // LED pins
                gpio_e[i] = 1
                gpio_o[i] = led_ctrl[i-4]
                cpu_io_in[i] = gpio_i[i]
            elsif (i < 10) then // LSS interface clock pins
                gpio_e[i] = 1
                gpio_o[i] = lss_gpio_clk[i-8]
                cpu_io_in[i] = gpio_i[i]
            elsif (i < 12) then // LSS interface data pins
                gpio_e[i] = lss_gpio_e[i-10]
                gpio_o[i] = lss_gpio_do[i-10]
                gpio_lss_di[i-10] = gpio_i[i]
            else // ISI interface pins
                gpio_e[i] = isi_gpio_e[i-12]
                gpio_o[i] = isi_gpio_dout[i-12]
                gpio_isi_din[i-12] = gpio_i[i]
    }

13.10.5 LED pulse generator 

The pulse generator logic consists of a 7-bit counter that is incremented on a 1µs pulse from the timers
block (tim_pulse[0]). The LED control signal is generated from comparing the count value with the
configured duty cycle for the LED (led_duty_sel).

The logic is given by:

    for (i=0; i<4; i++) { // for each LED pin
        // period divided into 8 segments
        period_div8 = cnt[6:4]
        if (period_div8 <= led_duty_sel[i]) then
            led_ctrl[i] = 1
        else
            led_ctrl[i] = 0
        // in higher half invert the led control
        if (cnt[6] == 1) then
            led_ctrl[i] = ~led_ctrl[i]
    }

    // update the counter every 1us pulse
    if (tim_pulse[0] == 1) then
        cnt++

13.10.6 Motor control 

The motor controller consists of 2 counters, and 4 phase generator logic blocks, one per motor control pin.
The counters decrement each time a timing pulse (cnt_en) is received. The counters start at the configured
clock period value (motor_mas_clk_period) and decrement to zero. If the counters are enabled (via
motor_mas_clk_enable), the counters will automatically restart at the configured clock period value,
otherwise they will wait until the counters are re-enabled.
The timing pulse period is one of pclk, 1µs, 100µs, 10ms depending on the motor_mas_clk_src signal. The
counters are used to derive the phase and duty cycle of each motor control pin.

    // decrement logic
    if (cnt_en == 1) then
        if ((mas_cnt == 0) AND (motor_mas_clk_enable == 1)) then
            mas_cnt = motor_mas_clk_period[15:0]
        elsif ((mas_cnt == 0) AND (motor_mas_clk_enable == 0)) then
            mas_cnt = 0
        else
            mas_cnt--
    else // hold the value
        mas_cnt = mas_cnt



Figure 44. Motor control RTL diagram (inputs: tim_pulse[0], tim_pulse[1], motor_mas_clk_src,
motor_mas_clk_period[0], motor_mas_clk_enable[0], motor_mas_clk_period[1], motor_mas_clk_enable[1];
outputs: motor_ctrl, motor_mas_count)

The phase generator block generates the motor control logic based on the selected clock generator
(motor_mas_clk_sel), the motor control high transition point (motor_ctrl_high) and the motor control low
transition point (motor_ctrl_low). There are 4 instances, one per motor control pin.
The logic is given by:

    // select the input counter to use
    if (motor_mas_clk_sel == 1) then
        count = mas_cnt[1]
    else
        count = mas_cnt[0]
    // Generate the phase and duty cycle
    if ((motor_ctrl == 1) AND (count == motor_ctrl_low)) then
        motor_ctrl = 0
    elsif ((motor_ctrl == 0) AND (count == motor_ctrl_high)) then
        motor_ctrl = 1
    else
        motor_ctrl = motor_ctrl // remain the same
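
The interaction between one master counter and one phase generator can be sketched as a cycle-by-cycle software model. This is an illustration under assumptions, not the RTL: the function name is hypothetical, the pin is assumed to reset to 0, and one loop iteration stands for one timing pulse.

```python
def simulate_motor_pin(period, high, low, pulses, enable=True):
    """Return the motor control pin level seen at each timing pulse.

    period - MotorMasterClkPeriod value (the counter counts period..0)
    high   - MotorCtrlHigh: count at which the pin goes to 1
    low    - MotorCtrlLow: count at which the pin goes to 0
    """
    mas_cnt = period   # counter starts at the configured period value
    motor_ctrl = 0     # assumed reset level of the pin
    trace = []
    for _ in range(pulses):
        # phase generator: compare the count with the transition points
        if motor_ctrl == 1 and mas_cnt == low:
            motor_ctrl = 0
        elif motor_ctrl == 0 and mas_cnt == high:
            motor_ctrl = 1
        trace.append(motor_ctrl)
        # master counter: count down, reload at zero only if enabled
        if mas_cnt == 0:
            mas_cnt = period if enable else 0
        else:
            mas_cnt -= 1
    return trace


trace = simulate_motor_pin(period=7, high=6, low=2, pulses=16)
```

In this model a period value of 7 gives a cycle of 8 timing pulses, and the distance between the high and low transition points sets the duty cycle.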



13.10.7 Input deglitch 

The input deglitch logic rejects input states of duration less than the configured number of time units
(deglitch_cnt); input states of greater duration are reflected on the output cpu_io_in_deglitch. The time
units used (either pclk, 1µs, 100µs or 10ms) by the deglitch circuit are selected by the deglitch_clk_src bus.

There are 2 possible sets of deglitch_cnt and deglitch_clk_src that can be used to deglitch the input pins.
The values used are selected by the deglitch_sel signal.

Each input pin can be used to generate an interrupt. The interrupt can be generated from the raw input
signal or a deglitched version of the input. The interrupt source is selected by the interrupt_src_select signal.
The counter logic is given by

    if (cpu_io_in != cpu_io_in_delay) then
        cnt = deglitch_cnt
        output_en = 0
    elsif (cnt == 0) then
        cnt = cnt
        output_en = 1
    elsif (cnt_en == 1) then
        cnt--
        output_en = 0



Figure 45. Input de-glitch RTL diagram (inputs: cpu_io_in, tim_pulse[0], tim_pulse[1], tim_pulse[2],
deglitch_clk_sel[0], deglitch_clk_sel[1], deglitch_cnt[0], deglitch_cnt[1], deglitch_sel, interrupt_src_sel;
internal: cpu_io_in_delay, cnt_en, counter logic, compare, output_en; outputs: cpu_io_in_deglitch,
gpio_icu_irq)



13.10.8 Frequency Analyser 

The frequency analyser block monitors a selected input pin (selected by FreqAnaPinSelect and
FreqAnaPinFormSelect) and detects positive edges. Between successive positive edges detected on the input pin it
increments a counter by a programmed amount (FreqAnaCountInc) on each clock cycle. When a positive
edge is detected the FreqAnaLastPeriod register is updated with the top 16 bits of the counter and the
counter is reset. The frequency analyser also maintains a running average of the FreqAnaLastPeriod
register. Each time a positive edge is detected on the input pin the FreqAnaAverage register is updated with the
newly calculated FreqAnaLastPeriod. The average is calculated as 7/8 of the current value plus 1/8 of the new
value. Both the FreqAnaLastPeriod and FreqAnaAverage registers can be written to by the CPU.



The pseudocode is given by

    if ((pin == 1) AND (pin_delay == 0)) then
        // positive edge detected
        freq_ana_lastperiod = count[31:16]
        freq_ana_average = freq_ana_average - freq_ana_average/8 + freq_ana_lastperiod/8
        count = 0
    else
        count = count + freq_ana_count_inc
    // implement the configuration register write
    if (wr_last_en == 1) then
        freq_ana_lastperiod = wr_data
    elsif (wr_average_en == 1) then
        freq_ana_average = wr_data



Figure 46. Frequency analyser RTL diagram (inputs: cpu_io_in_deglitch[13:0], cpu_io_in[13:0],
freq_ana_pin_sel, wr_data[15:0], wr_last_en, wr_average_en, freq_ana_count_inc (20 bits); outputs:
freq_ana_last_period[15:0], freq_ana_average[15:0]; internal: count)
14 Interrupt Controller Unit (ICU)



The interrupt controller accepts up to N input interrupt sources, determines their priority, arbitrates based
on the highest priority and generates an interrupt request to the CPU. The ICU complies with the interrupt
acknowledge protocol of the CPU. Once the CPU accepts an interrupt (i.e. processing of its service routine
begins) the interrupt controller will assert the next arbitrated interrupt if one is pending.

Each interrupt source has a fixed vector number N, and an associated configuration register, IntReg[N].
The format of the IntReg[N] register is shown in Table 46 below.

Table 46. IntReg[N] register format

Field | Bit(s) | Description
------|--------|------------
Priority | 7:0 | Interrupt priority
Type | 9:8 | Determines the triggering conditions for the interrupt. 00 - Positive edge; 10 - Negative edge; 01 - Positive level; 11 - Negative level
Mask | 10 | Mask bit. 1 - Interrupts from this source are enabled; 0 - Interrupts from this source are disabled. Note that there may be additional masks in operation at the source of the interrupt.
Reserved | 31:11 | Reserved. Write as 0.



— „ uiicnupi coiuroiier oeremunes tne pnonty and maps the programmed ori- 

onty to ihc available CPU priority levels, and then issues an intemipt to the CPU. The mapping of pro- 
grammed pnonty to native intenrupt levels will be fixed, and is dependent on CPU choice. 
For example for the LEON CPU there are 15 levels available which would allow 16 sub-priorities per level 
(as each level is in itself a priority). In this case priorities 255-240 m^ to level 15. 240-224 to level 14 and 
so on, with pnonties 15-0 conesponding to level 0. Level 0 is no intenupt Level 15 is the highest interrupt 
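The LEON mapping described above amounts to splitting the 8-bit programmed priority into an upper nibble (the CPU level) and a lower nibble (the sub-priority within that level). A small Python sketch of that arithmetic follows; the function names are illustrative and are not taken from the specification.

```python
def priority_to_level(priority: int) -> int:
    """Map an 8-bit programmed priority to a LEON interrupt level (0 to 15)."""
    if not 0 <= priority <= 255:
        raise ValueError("priority must fit in 8 bits")
    return priority >> 4          # upper nibble selects the level


def sub_priority(priority: int) -> int:
    """Return the 4-bit sub-priority within the level."""
    if not 0 <= priority <= 255:
        raise ValueError("priority must fit in 8 bits")
    return priority & 0xF         # lower nibble orders sources within a level
```

For example, priorities 240 to 255 all give level 15, and priorities 0 to 15 give level 0, which the CPU treats as no interrupt.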



14.1 Interrupt preemption 

There are two types of pre-emption possible: standard LEON pre-emption and SoPEC pending pre-emption.
With standard LEON pre-emption an interrupt can only be pre-empted by an interrupt with a higher
priority level. If an interrupt with the same priority level (1 to 15) as the interrupt being serviced becomes
pending then it is not acknowledged until the current service routine has completed. The SoPEC pending
pre-emption is an extension of the standard LEON scheme which is made possible by the programmable
priority levels in the IntReg[N] register.

Interrupts with a higher sub-priority will pre-empt interrupts with a lower sub-priority but the same
priority level mapping, if the interrupt has not been acknowledged by the CPU, i.e. it is still pending. If an
interrupt with a higher sub-priority arrives while an interrupt with a lower sub-priority at the same level is
being serviced then it will not be serviced until the lower sub-priority service routine has completed.
Thus when pre-emption is required, interrupts should be programmed to different levels, as interrupt
priorities of the same level have no guaranteed servicing order.

The interrupt is directly acknowledged by the CPU and the ICU automatically clears the pending bit of
acknowledged interrupts.

All interrupt controller registers are only accessible in supervisor data mode. If the user code wishes to
mask an interrupt it must request this from the supervisor and the supervisor software will resolve user
access levels.

14.2 Interrupt sources

The mapping of interrupt sources to interrupt vectors (and therefore IntReg[N] registers) is shown in 
Table 47 below. Please refer to the appropriate section of this specification for more details of the interrupt 
sources. 



Table 47. Interrupt sources vector table

Vector   Source   Description
0        Timers   WatchDog Timer update request
1        Timers   Generic Timer 1 interrupt
2        Timers   Generic Timer 2 interrupt
3        Timers   Generic Timer 3 interrupt
4-17     GPIO     GPIO general interrupt, source pin 0-13
18       MMU      MMU security violation
19       SCB      USB interrupt
20       SCB      ISI interrupt
21       SCB      DMA interrupt
22       LSS      LSS interrupt, LSS interface 0 interrupt request
23       LSS      LSS interrupt, LSS interface 1 interrupt request
24       PCU      PEP Sub-system interrupt - CDU finished band
25       PCU      PEP Sub-system interrupt - CDU error
26       PCU      PEP Sub-system interrupt - LBD finished band
27       PCU      PEP Sub-system interrupt - TE finished band
28       PCU      PEP Sub-system interrupt - PCU finished band
29       PCU      PEP Sub-system interrupt - PCU invalid address interrupt
30       PCU      PEP Sub-system interrupt - PHI buffer underrun
31       PCU      PEP Sub-system interrupt - PHI page finished
32       PCU      PEP Sub-system interrupt - PHI print ready
33       PHI      PEP Sub-system interrupt - PHI line sync interrupt
14.3 Implementation

14.3.1 Definitions of I/O 

Table 48. Interrupt Controller Unit I/O definition

Port name                 Pins   I/O   Description
Clocks and Resets
pclk                      1      In    System Clock
prst_n                    1      In    System reset, synchronous active low
CPU Interface
cpu_adr[7:2]              6      In    CPU address bus. Only 6 bits are required to decode the
                                       address space for the ICU block
cpu_dataout[31:0]         32     In    Shared write data bus from the CPU
icu_cpu_data[31:0]        32     Out   Read data bus to the CPU
cpu_rwn                   1      In    Common read/not-write signal from the CPU
cpu_icu_sel               1      In    Block select from the CPU. When cpu_icu_sel is high both
                                       cpu_adr and cpu_dataout are valid
icu_cpu_rdy               1      Out   Ready signal to the CPU. When icu_cpu_rdy is high it indicates
                                       the last cycle of the access. For a write cycle this means
                                       cpu_dataout has been registered by the ICU block and for a read
                                       cycle this means the data on icu_cpu_data is valid.
icu_cpu_ilevel[3:0]       4      Out   Indicates the priority level of the current active interrupt
cpu_iack                  1      Out   Interrupt request acknowledge from the LEON core.
cpu_icu_ilevel[3:0]       4      In    Interrupt acknowledged level from the LEON core
icu_cpu_berr              1      Out   Bus error signal to the CPU indicating an invalid access.
cpu_acode[1:0]            2      In    CPU Access Code signals. These decode as follows:
                                       00 - User program access
                                       01 - User data access
                                       10 - Supervisor program access
                                       11 - Supervisor data access
icu_cpu_debug_valid       1      Out   Debug Data valid on icu_cpu_data bus. Active high
Interrupts
tim_icu_wd_irq            1      In    Watchdog timer interrupt signal from the Timers block
tim_icu_irq[2:0]          3      In    Generic timer interrupt signals from the Timers block
gpio_icu_irq[13:0]        14     In    GPIO pin interrupts
mmu_icu_irq               1      In    Memory Management Unit interrupt
usb_icu_irq               1      In    USB interrupt from the SCB
isi_icu_irq               1      In    ISI interrupt from the SCB
dma_icu_irq               1      In    DMA interrupt from the SCB
lss_icu_irq[1:0]          2      In    LSS interface interrupt request
cdu_finishedband          1      In    Finished band interrupt request from the CDU
cdu_icu_jpegerror         1      In    JPEG error interrupt from the CDU
lbd_finishedband          1      In    Finished band interrupt request from the LBD
te_finishedband           1      In    Finished band interrupt request from the TE
pcu_finishedband          1      In    Finished band interrupt request from the PCU
pcu_icu_address_invalid   1      In    Invalid address interrupt request from the PCU
phi_icu_underrun          1      In    Buffer underrun interrupt request from the PHI
phi_icu_page_finish       1      In    Page finished interrupt request from the PHI
phi_icu_print_rdy         1      In    Print ready interrupt request from the PHI
phi_icu_linesync_int      1      In    Line sync interrupt request from the PHI



14.3.2 Configuration registers

The configuration registers in the ICU are programmed via the CPU interface. Refer to section 11.4 on
page 69 for a description of the protocol and timing diagrams for reading and writing registers in the ICU.
Note that since addresses in SoPEC are byte aligned and the CPU only supports 32-bit register reads and
writes, the lower 2 bits of the CPU address bus are not required to decode the address space for the ICU.
When reading a register that is less than 32 bits wide, zeros should be returned on the upper unused bit(s)
of icu_cpu_data. Table 49 lists the configuration registers in the ICU block.

The ICU block will only allow supervisor data mode accesses (i.e. cpu_acode[1:0] = SUPERVISOR_DATA).
All other accesses will result in icu_cpu_berr being asserted.

Table 49. ICU Register Map

Address       Register          #bits   Reset         Description
0x00 - 0x84   IntReg[33:0]      34x11   0x000         Interrupt vector configuration register
0x88 - 0x8C   IntClear[1:0]     2x32    0x0000_0000   Interrupt pending clear register. If written with a one
                                                      it clears the corresponding interrupt.
                                                      IntClear[0] - Interrupt sources 31 to 0
                                                      IntClear[1] - Interrupt sources 33 to 32
0x90 - 0x94   IntPending[1:0]   2x32    0x0000_0000   Interrupt pending register. (Read Only)
                                                      IntPending[0] - Interrupt sources 31 to 0
                                                      IntPending[1] - Interrupt sources 33 to 32
0x98          IntSource         6       0x00          Indicates the interrupt source of the current winning
                                                      active interrupt. (Read Only)
0x9C          DebugSelect       6       0x00          Debug address select. Indicates the address of the
                                                      register to report on the icu_cpu_data bus when it
                                                      is not otherwise being used.
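As an illustration of how software might use this register map, the Python sketch below computes register offsets and packs an IntReg[N] value from the Table 46 fields. The helper names are hypothetical, and the IntClear base offset of 0x88 is an assumption that follows from 34 word-aligned IntReg registers occupying 0x00 to 0x84.

```python
NUM_SOURCES = 34
INTCLEAR_BASE = 0x88      # assumed: first word after IntReg[33] at 0x84


def intreg_offset(n: int) -> int:
    """Byte offset of IntReg[n]; registers are 32-bit aligned."""
    if not 0 <= n < NUM_SOURCES:
        raise ValueError("no such interrupt source")
    return 4 * n


def pack_intreg(priority: int, trig_type: int, mask: int) -> int:
    """Pack priority (bits 7:0), type (bits 9:8) and mask (bit 10)."""
    return ((mask & 1) << 10) | ((trig_type & 0b11) << 8) | (priority & 0xFF)


def intclear_word_and_bit(n: int):
    """Return (byte offset, bit mask) to write in order to clear source n."""
    if not 0 <= n < NUM_SOURCES:
        raise ValueError("no such interrupt source")
    return INTCLEAR_BASE + 4 * (n // 32), 1 << (n % 32)
```

For instance, enabling source 33 as a positive-level interrupt with priority 0xF0 would mean writing `pack_intreg(0xF0, 0b01, 1)` to offset `intreg_offset(33)`.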
14.3.3 ICU partition

Figure 47. ICU partition (blocks shown: 34 interrupt detect instances, interrupt arbiter, configuration
registers, interrupt controller and CPU. Interrupt inputs: tim_icu_wd_irq, tim_icu_irq[2:0],
gpio_icu_irq[13:0], mmu_icu_irq, usb_icu_irq, isi_icu_irq, dma_icu_irq, lss_icu_irq[1:0], cdu_finishedband,
cdu_icu_jpegerror, lbd_finishedband, te_finishedband, pcu_finishedband, pcu_icu_address_invalid,
phi_icu_page_finish, phi_icu_print_rdy, phi_icu_underrun and phi_icu_linesync_int. Internal signals shown:
int_src and cpu_int_clear)



14.3.4 Interrupt detect 

The ICU contains multiple instances of the interrupt detect block, one per interrupt source. The interrupt
detect block examines the interrupt source signal, and determines whether it should generate request
pending (int_pend) based on the configured interrupt type and the interrupt source conditions. If the
interrupt is not masked the interrupt will be reflected to the interrupt arbiter via the int_active signal.
Once an interrupt is pending it remains pending until the interrupt is accepted by the CPU or it is level
sensitive and gets removed. Masking a pending interrupt has the effect of removing the interrupt from
arbitration but the interrupt will still remain pending.

When the CPU accepts the interrupt (using the normal ISR mechanism), the interrupt controller
automatically generates an interrupt clear for that interrupt source (cpu_int_clear). Alternatively if the
interrupt is masked, the CPU can determine pending interrupts by polling the IntPending registers. Any
active pending interrupts can be cleared by the CPU without using an ISR via the IntClear registers.
The logic is shown below:

    mask = int_config[10]
    type = int_config[9:8]
    int_priority = int_config[7:0]
    int_pend = last_int_pend    // the last pending interrupt
    // update the pending FF
    if ((int_clear == 1) OR (cpu_int_clear == 1)) then
        int_pend = 0
    // test for interrupt condition
    if ((type == NEG_LEVEL) AND (int_src == 0)) then
        int_pend = 1
    elsif ((type == POS_LEVEL) AND (int_src == 1)) then
        int_pend = 1
    elsif ((type == NEG_EDGE) AND (int_src == 0) AND (last_int_src == 1)) then
        int_pend = 1
    elsif ((type == POS_EDGE) AND (int_src == 1) AND (last_int_src == 0)) then
        int_pend = 1
    else
        int_pend = int_pend    // stay the same as before
    // mask the pending bit
    if (mask == 1) then
        int_active = int_pend
    else
        int_active = 0
    // assign the registers
    last_int_src = int_src
    last_int_pend = int_pend

Doc: SoPEC_hardware_design, Version 2.3. S3 Proprietary Document. 29 Nov 2002.
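A software model can make the detect behaviour easier to check. The Python sketch below is an illustration only: it assumes the conventional edge definitions (a positive edge is a 0 to 1 transition of int_src, a negative edge is 1 to 0), merges the two clear inputs into one `clear` argument, and uses the type encodings from the IntReg[N] format. The function name is hypothetical.

```python
POS_EDGE, POS_LEVEL, NEG_EDGE, NEG_LEVEL = 0b00, 0b01, 0b10, 0b11


def interrupt_detect(int_config, int_src, last_int_src, last_int_pend, clear=0):
    """Return (int_pend, int_active) for one interrupt source on one clock."""
    mask = (int_config >> 10) & 1
    trig = (int_config >> 8) & 0b11
    # Start from the previous pending value, cleared if requested.
    int_pend = 0 if clear else last_int_pend
    if trig == NEG_LEVEL and int_src == 0:
        int_pend = 1
    elif trig == POS_LEVEL and int_src == 1:
        int_pend = 1
    elif trig == NEG_EDGE and int_src == 0 and last_int_src == 1:
        int_pend = 1
    elif trig == POS_EDGE and int_src == 1 and last_int_src == 0:
        int_pend = 1
    # Masking removes the source from arbitration but keeps it pending.
    int_active = int_pend if mask else 0
    return int_pend, int_active
```

Note that a masked source still reports `int_pend == 1`, matching the statement that masking removes an interrupt from arbitration without clearing it.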

14.3.5 Interrupt arbiter 

The interrupt arbiter logic arbitrates a winning interrupt request from multiple pending requests based on
configured priority. It generates the interrupt to the CPU by setting icu_cpu_ilevel to a non-zero value. The
priority of the interrupt is reflected by the value assigned to icu_cpu_ilevel: the higher the value the higher
the priority, 15 being the highest. The current winning interrupt is reported to the CPU via the IntSource
register generated in the interrupt arbiter block.

    // arbitrate based on priority
    if (arb_enable == 1) then {
        // arbitrate with the current winner
        win_int_priority = 0
        int_src = 0
        int_request = 0
        for (i=0; i<34; i++) {
            if (int_active[i] == 1) then {
                if (int_priority[i] > win_int_priority) then {
                    win_int_priority = int_priority[i]
                    int_src = i
                    int_request = 1
                }
            }
        }
        // assign the CPU interrupt level
        int_ilevel = int_priority[int_src][7:4]
    }
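The arbitration loop can be sketched in Python as follows. This mirrors the pseudocode as reconstructed here, including its strict greater-than comparison, so an active source programmed with priority 0 never wins and ties go to the lowest vector number. The function name `arbitrate` is illustrative.

```python
def arbitrate(int_active, int_priority):
    """Return (int_src, int_request, int_ilevel) for the winning source."""
    win_priority, int_src, int_request = 0, 0, 0
    for i, (active, priority) in enumerate(zip(int_active, int_priority)):
        if active and priority > win_priority:
            win_priority, int_src, int_request = priority, i, 1
    # The CPU level is the upper nibble of the winner's 8-bit priority.
    # As in the pseudocode, this is computed even when nothing is requested.
    int_ilevel = (int_priority[int_src] >> 4) & 0xF
    return int_src, int_request, int_ilevel


active = [0] * 34
priority = [0] * 34
active[3], priority[3] = 1, 0x25
active[20], priority[20] = 1, 0xF1
winner = arbitrate(active, priority)
```

Here source 20 wins with level 15, while source 3 (level 2) stays pending until it becomes the highest active priority.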

14.3.6 Interrupt controller 

The interrupt controller is responsible for generating the interrupt to the CPU, accepting the interrupt
acknowledge from the CPU and clearing the interrupt source pending bit.
The exact procedure is CPU dependent, but examples are given for the LEON processor. See section 11.9
on page 98 for a complete description of the interrupt handling procedure.



Figure 48. Interrupt controller state diagram

The machine remains in the same state by default. All outputs are zero unless otherwise stated.
State description:
Reset: Normal reset state (arb_enable = 1)
IntPend: Interrupt pending, waiting for CPU acknowledge (icu_cpu_ilevel = int_ilevel, arb_enable = 1)
IntClear: Interrupt clear, clear the pending bit for the current interrupt vector
(cpu_int_clear[int_src] = 1, arb_enable = 0)



After reset the interrupt controller remains in the Reset state until the interrupt arbiter indicates that there
is an active interrupt pending (int_request equal to 1). The state machine goes to the IntPend state and
signals to the CPU that an interrupt is pending. The machine will remain in the IntPend state until the
interrupt is acknowledged by the CPU or the pending interrupt condition is removed.

When the interrupt is acknowledged the state machine goes to the IntClear state to clear the pending bit of
the interrupt source.

On completion the state machine returns to the Reset state and again waits for the next pending interrupt.
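The three-state behaviour described above can be written as a next-state function. This Python sketch rests on two assumptions that the text does not spell out: that removal of the pending condition in IntPend returns the machine to Reset, and that a single `cpu_iack` flag is enough to represent the acknowledge (the real condition may also compare the acknowledged level). The function name is hypothetical.

```python
def icu_next_state(state: str, int_request: int, cpu_iack: int) -> str:
    """Next state of the interrupt controller state machine."""
    if state == "Reset":
        return "IntPend" if int_request else "Reset"
    if state == "IntPend":
        if cpu_iack:
            return "IntClear"      # acknowledged: go clear the pending bit
        if not int_request:
            return "Reset"         # assumed: pending condition was removed
        return "IntPend"
    if state == "IntClear":
        return "Reset"             # one pass to clear, then wait again
    raise ValueError(f"unknown state {state!r}")
```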
15 Timers Block (TIM)

The Timers block contains general purpose timers, a watchdog timer and timing pulse generator for use in
other sections of SoPEC.

15.1 Watchdog timer

The watchdog timer is a 32 bit counter value which counts down each time a timing pulse is received. The
period of the timing pulse is selected by the WatchDogUnitSel register. The value at any time can be read
from the WatchDogTimer register and the counter can be reset by writing a non-zero value to the register.
Should the counter reach 1, a system wide reset will be triggered as if the reset came from a hardware pin.

The watchdog timer can be polled by the CPU and reset each time it gets close to 1, or alternatively a
threshold (WatchDogIntThres) can be set to trigger an interrupt for the watchdog timer to be serviced by
the CPU. This interrupt can be effectively masked by setting the threshold to zero. The watchdog timer can
be disabled, without causing a reset, by writing zero to the WatchDogTimer register.



15.2 Timing pulse generator

The timing block contains a timing pulse generator clocked by the system clock, used to generate timing
pulses of 1µs, 100µs and 10ms. Each pulse is of one system clock duration and is active high, with the
pulse period accurate to the system clock frequency.

The timing pulse generator also contains a 64-bit free running counter that can be read or reset by
accessing the FreeRunCount register.



15.3 Generic timers 



SoPEC contains 3 programmable generic timing counters, for use by the CPU to time the system. The
timers are programmed to a particular value and count down each time a timing pulse is received. If a
particular timer decrements to 0, then an interrupt is generated. The counter can be programmed to
automatically restart the count, or wait until re-programmed by the CPU. At any time the status of the
counter can be read from GenCntValue, or can be reset by writing to the GenCntValue register. The
auto-restart is activated by setting the GenCntAuto register; when activated the counter restarts at
GenCntStartValue. A counter can be stopped or started at any time, without affecting the contents of the
GenCntValue register, by writing a 1 or 0 to the relevant GenCntEnable register.
15.4 Implementation

15.4.1 Definitions of I/O 



Table 50. Timers block I/O definition

Port name             Pins   I/O   Description
Clocks and Resets
pclk                  1      In    System Clock
prst_n                1      In    System reset, synchronous active low
tim_pulse[2:0]        3      Out   Timers block generated timing pulses, each one pclk wide
                                   0 - 1µs pulse
                                   1 - 100µs pulse
                                   2 - 10ms pulse
CPU Interface
cpu_adr[6:2]          5      In    CPU address bus. Only 5 bits are required to decode the
                                   address space for the TIM block
cpu_dataout[31:0]     32     In    Shared write data bus from the CPU
tim_cpu_data[31:0]    32     Out   Read data bus to the CPU
cpu_rwn               1      In    Common read/not-write signal from the CPU
cpu_tim_sel           1      In    Block select from the CPU. When cpu_tim_sel is high both
                                   cpu_adr and cpu_dataout are valid
tim_cpu_rdy           1      Out   Ready signal to the CPU. When tim_cpu_rdy is high it indicates
                                   the last cycle of the access. For a write cycle this means
                                   cpu_dataout has been registered by the TIM block and for a read
                                   cycle this means the data on tim_cpu_data is valid.
tim_cpu_berr          1      Out   Bus error signal to the CPU indicating an invalid access.
cpu_acode[1:0]        2      In    CPU Access Code signals. These decode as follows:
                                   00 - User program access
                                   01 - User data access
                                   10 - Supervisor program access
                                   11 - Supervisor data access
tim_cpu_debug_valid   1      Out   Debug Data valid on tim_cpu_data bus. Active high
Miscellaneous
tim_icu_wd_irq        1      Out   Watchdog timer interrupt signal to the ICU block
tim_icu_irq[2:0]      3      Out   Generic timer interrupt signals to the ICU block
tim_cpr_reset_n       1      Out   Watchdog timer system reset.
15.4.2 Timers sub-block partition

Figure 49. Timers sub-block partition diagram (blocks shown: CPU interface and configuration registers,
timing pulse generator, watchdog timer and generic timers. CPU interface signals: cpu_adr, cpu_tim_sel,
cpu_dataout, cpu_rwn, cpu_acode, tim_cpu_rdy, tim_cpu_data, tim_cpu_berr and tim_cpu_debug_valid.
Internal signals: free_run_cnt, free_run_data, free_run_wen, free_run_adr, wdog_unit_sel, wdog_wen,
wdog_tim_data, wdog_tim_cnt, gen_tim_en, gen_tim_auto, gen_unit_sel, gen_wen, gen_tim_data, gen_tim_cnt,
gen_tim_cnt_st_value and tim_pulse[2:0]. Outputs: tim_icu_wd_irq, tim_cpr_reset_n and tim_icu_irq[2:0])



15.4.3 Watchdog timer

The watchdog timer counts down from a pre-programmed value, and generates a system wide reset when
equal to one. When the counter passes a pre-programmed threshold (wdog_tim_thres) value an interrupt is
generated (tim_icu_wd_irq) requesting the CPU to update the counter. Setting the counter to zero disables
the watchdog reset. In supervisor mode the watchdog counter can be written to or read from at any time;
in user mode access is denied. Any accesses in user mode will generate a bus error.

Figure 50. Watchdog timer RTL diagram (inputs wdog_unit_sel, tim_pulse[0], tim_pulse[1], tim_pulse[2],
wdog_wen and wdog_tim_data; outputs tim_icu_wd_irq and tim_cpr_reset_n)



The counter logic is given by

    if (wdog_wen == 1) then
        wdog_tim_cnt = wdog_tim_data    // load new data
    elsif (wdog_tim_cnt == 0) then
        wdog_tim_cnt = wdog_tim_cnt     // count disabled
    elsif (cnt_en == 1) then
        wdog_tim_cnt--
    else
        wdog_tim_cnt = wdog_tim_cnt

The timer decode logic is

    if ((wdog_tim_cnt == wdog_tim_thres) AND (wdog_tim_cnt != 0)) then
        tim_icu_wd_irq = 1
    else
        tim_icu_wd_irq = 0
    // reset generator logic
    if (wdog_tim_cnt == 1) then
        tim_cpr_reset_n = 0
    else
        tim_cpr_reset_n = 1
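The watchdog counter and its decode can be modelled in a few lines. In this Python sketch the decode is evaluated on the updated count, which is a modelling choice rather than a statement about the RTL timing; the function name and argument order are also illustrative.

```python
def watchdog_step(cnt, cnt_en, thres, wen=0, data=0):
    """Return (cnt, tim_icu_wd_irq, tim_cpr_reset_n) after one clock."""
    if wen:
        cnt = data & 0xFFFFFFFF        # load new data
    elif cnt == 0:
        pass                           # a zero count disables the watchdog
    elif cnt_en:
        cnt -= 1                       # count down on the selected pulse
    irq = int(cnt == thres and cnt != 0)
    reset_n = 0 if cnt == 1 else 1     # active-low reset when the count hits 1
    return cnt, irq, reset_n
```

Counting from 3 with a threshold of 2 raises the interrupt on the first step and asserts the active-low reset on the second, unless the CPU reloads the counter in between.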



15.4.4 Generic timers

The generic timers block consists of 3 identical counters. A timer is set to a pre-configured value
(GenCntStartValue) and counts down once per selected timing pulse (gen_unit_sel). The timer can be enabled
or disabled at any time (gen_tim_en); when disabled the counter is stopped but not cleared. The timer can
be set to automatically restart (gen_tim_auto) after it hits zero. In supervisor mode a timer can be written
to or read from at any time; in user mode access is determined by the GenCntUserModeEnable register settings.
Figure 51. Generic timer RTL diagram (inputs gen_unit_sel, tim_pulse[0], tim_pulse[1], tim_pulse[2],
gen_tim_cnt_st_value, gen_wen, gen_tim_data, gen_tim_en and gen_tim_auto; counter gen_tim_cnt; decode
logic output tim_icu_irq)

The counter logic is given by

    if (gen_wen == 1) then
        gen_tim_cnt = gen_tim_data
    elsif ((cnt_en == 1) AND (gen_tim_en == 1)) then
        if (gen_tim_cnt == 0) then    // counter may need re-starting
            if (gen_tim_auto == 1) then
                gen_tim_cnt = gen_tim_cnt_st_value
            else
                gen_tim_cnt = gen_tim_cnt
        else
            gen_tim_cnt--
    else
        gen_tim_cnt = gen_tim_cnt

The decode logic is

    if (gen_tim_cnt == 1) then
        tim_icu_irq = 1
    else
        tim_icu_irq = 0
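A Python model of one generic timer shows the auto-restart and enable behaviour. As with the other sketches, the decode is applied to the updated count as a modelling simplification, and the function name and parameters are illustrative rather than taken from the design.

```python
def generic_timer_step(cnt, cnt_en, en, auto, start_value, wen=0, data=0):
    """Return (gen_tim_cnt, tim_icu_irq) after one clock of one timer."""
    if wen:
        cnt = data                     # CPU write has priority
    elif cnt_en and en:
        if cnt == 0:
            if auto:
                cnt = start_value      # auto-restart from the start value
        else:
            cnt -= 1
    irq = int(cnt == 1)
    return cnt, irq
```

A disabled timer (`en == 0`) holds its count, which matches the statement that a stopped counter is not cleared.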
15.4.5 Timing pulse generator

The timing pulse generator contains a general free running 64-bit timer and 3 timing pulse generators
producing timing pulses of one cycle duration with a period of 1µs, 100µs and 10ms. In supervisor mode the
free running timer register can be written to or read from at any time; in user mode access is denied.
The status of each of the 1µs, 100µs and 10ms timers can be read by accessing the PulseTimerStatus
register in supervisor mode. Any accesses in user mode will result in a bus error.



Figure 52. Pulse generator RTL diagram (free run timer with inputs free_run_wen, free_run_data and
free_run_adr and output free_run_cnt; 1us, 100us and 10ms timers, each with decrement logic and a compare
stage, producing pulse_1us, pulse_100us and the 10ms pulse on tim_pulse[0], tim_pulse[1] and tim_pulse[2];
the timer counts and tim_pulse[2:0] are combined into pulse_timer_status)



15.4.5.1 Free Run Timer 



The increment logic block increments the timer count on each clock cycle. The counter wraps around to
zero and continues incrementing if overflow occurs. When the timing register (FreeRunCount) is written
to, the configuration registers block will set free_run_wen high for a clock cycle and the value on
free_run_data will become the new count value, for the 32 bits selected by the free_run_adr signal. If
free_run_adr is 1 the higher 32 bits of the counter will be written to, otherwise the lower 32 bits are
written to. It is the responsibility of software to handle these writes in a sensible manner.

The increment logic is given by

    if (free_run_wen == 1) then
        if (free_run_adr == 1) then
            free_run_cnt[63:32] = free_run_data
        else
            free_run_cnt[31:0] = free_run_data
    else
        free_run_cnt++
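Because the 64-bit counter is written in two 32-bit halves, a write replaces only the selected half and leaves the other untouched. The Python sketch below models that, along with the wrap to zero on overflow; the function name is illustrative.

```python
MASK32 = 0xFFFFFFFF
MASK64 = (1 << 64) - 1


def free_run_step(cnt, wen=0, adr=0, data=0):
    """Return the 64-bit free running count after one clock."""
    if wen:
        if adr == 1:
            # Replace bits 63:32, keep bits 31:0.
            return ((data & MASK32) << 32) | (cnt & MASK32)
        # Replace bits 31:0, keep bits 63:32.
        return (cnt & (MASK64 ^ MASK32)) | (data & MASK32)
    return (cnt + 1) & MASK64          # wrap around to zero on overflow
```

Since the counter does not increment on a write cycle and the two halves are written on different cycles, software that needs a consistent 64-bit value has to account for increments between the two writes, which is the "sensible manner" caveat in the text.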

15.4.5.2 Pulse Timers 

The pulse timer logic generates timing pulses of 1 clock cycle length and period of 1µs, 100µs and 10ms.
The logic for the 1µs timer is given by:

    // 1us generator
    if (pulse_1us_cnt == 0) then
        pulse_1us_cnt = 159
        pulse_1us = 1
    else
        pulse_1us_cnt--
        pulse_1us = 0

The logic for the 100µs timer is given by:

    // 100us generator
    if ((pulse_100us_cnt == 0) AND (pulse_1us == 1)) then
        pulse_100us_cnt = 99
        pulse_100us = 1
    elsif (pulse_1us == 1) then
        pulse_100us_cnt--
        pulse_100us = 0
    else
        pulse_100us_cnt = pulse_100us_cnt
        pulse_100us = 0

The logic for the 10ms timer is given by:

    // 10ms generator
    if ((pulse_10ms_cnt == 0) AND (pulse_100us == 1)) then
        pulse_10ms_cnt = 99
        pulse_10ms = 1
    elsif (pulse_100us == 1) then
        pulse_10ms_cnt--
        pulse_10ms = 0
    else
        pulse_10ms_cnt = pulse_10ms_cnt
        pulse_10ms = 0
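The cascade can be checked by simulation. The Python sketch below treats every counter and pulse as a register updated from the previous cycle's values, and it assumes that each slower counter holds its value when its enabling pulse is absent (the reading under which the periods come out as 160 and 16,000 clocks, i.e. 1µs and 100µs at an assumed 160MHz pclk). Function names are illustrative.

```python
def pulse_step(state):
    """Advance (cnt_1us, p_1us, cnt_100us, p_100us, cnt_10ms, p_10ms) one clock."""
    c1, p1, c100, p100, c10m, p10m = state
    if c1 == 0:
        n_c1, n_p1 = 159, 1
    else:
        n_c1, n_p1 = c1 - 1, 0
    if c100 == 0 and p1 == 1:
        n_c100, n_p100 = 99, 1
    elif p1 == 1:
        n_c100, n_p100 = c100 - 1, 0
    else:
        n_c100, n_p100 = c100, 0       # assumed: hold between 1us pulses
    if c10m == 0 and p100 == 1:
        n_c10m, n_p10m = 99, 1
    elif p100 == 1:
        n_c10m, n_p10m = c10m - 1, 0
    else:
        n_c10m, n_p10m = c10m, 0       # assumed: hold between 100us pulses
    return (n_c1, n_p1, n_c100, n_p100, n_c10m, n_p10m)


def pulse_times(n_cycles):
    """Return the clock numbers at which the 1us and 100us pulses are high."""
    state = (0, 0, 0, 0, 0, 0)
    t_1us, t_100us = [], []
    for cycle in range(1, n_cycles + 1):
        state = pulse_step(state)
        if state[1]:
            t_1us.append(cycle)
        if state[3]:
            t_100us.append(cycle)
    return t_1us, t_100us
```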

15.4.6 Configuration registers 

The configuration registers in the TIM are programmed via the CPU interface. Refer to section 11.4.3 on
page 70 for a description of the protocol and timing diagrams for reading and writing registers in the TIM.
Note that since addresses in SoPEC are byte aligned and the CPU only supports 32-bit register reads and
writes, the lower 2 bits of the CPU address bus are not required to decode the address space for the TIM.
When reading a register that is less than 32 bits wide, zeros should be returned on the upper unused bit(s)
of tim_cpu_data. Table 51 lists the configuration registers in the TIM block.



Table 51. Timers Register Map

Address        Register                #bits   Reset         Description
0x00           WatchDogUnitSel         3       0x0           Specifies the units used for the watchdog timer:
                                                             0 - 1µs pulse
                                                             1 - 100µs pulse
                                                             2 - 10ms pulse
                                                             3 - pclk
0x04           WatchDogTimer           32      0xFFFF_FFFF   Specifies the number of units to count before
                                                             watchdog timer triggers.
0x08           WatchDogIntThres        32      0x0000_0000   Specifies the threshold value below which the
                                                             watchdog timer issues an interrupt
0x0C-0x10      FreeRunCount[1:0]       2x32    0x0000_0000   Direct access to the free running counter register.
                                                             Bus 0 - Access to bits 31-0
                                                             Bus 1 - Access to bits 63-32
0x14 to 0x1C   GenCntStartValue[2:0]   3x32    0x0000_0000   Generic timer counter start value, number of
                                                             units to count before event
0x20 to 0x28   GenCntValue[2:0]        3x32    0x0000_0000   Direct access to generic timer counter registers
0x2C to 0x34   GenCntUnitSel[2:0]      3x2     0x0           Generic counter unit select. Selects the timing
                                                             units used with corresponding counter:
                                                             0 - 1µs pulse
                                                             1 - 100µs pulse
                                                             2 - 10ms pulse
0x38 to 0x40   GenCntAuto[2:0]         3x1     0x0           Generic counter auto re-start select. When
                                                             high timer automatically restarts, otherwise
                                                             timer stops.
0x44 to 0x4C   GenCntEnable[2:0]       3x1     0x0           Generic counter enable.
                                                             0 - Counter disabled
                                                             1 - Counter enabled
0x50           GenCntUserModeEnable    3       0x0           User Mode Access enable to generic timer
                                                             configuration register. When 1 user access is
                                                             enabled.
                                                             Bit 0 - Generic timer 0
                                                             Bit 1 - Generic timer 1
                                                             Bit 2 - Generic timer 2
0x54           DebugSelect             6       0x00          Debug address select. Indicates the address
                                                             of the register to report on the tim_cpu_data
                                                             bus when it is not otherwise being used.
Read Only Registers
0x58           PulseTimerStatus        24      0x00          Current pulse timer values, and pulses
                                                             6:0 - 1us timer count
                                                             7 - 1us pulse
                                                             14:8 - 100us timer count
                                                             15 - 100us pulse
                                                             22:16 - 10ms timer count
                                                             23 - 10ms pulse
15.4.6.1 Supervisor and user mode access

The configuration registers block examines the CPU access type (cpu_acode signal) and determines if the
access is allowed to that particular register, based on configured user access registers. If an access is not
allowed the block will issue a bus error by asserting the tim_cpu_berr signal.

The timers block is fully accessible in supervisor data mode: all registers can be written to and read from.
In user mode access is denied to all registers in the block except for the generic timer configuration
registers that are granted user data access. User data access for a generic timer is granted by setting the
corresponding bit in the GenCntUserModeEnable register. This can only be changed in supervisor data mode.
If a particular timer is granted user data access then all registers for configuring that timer will be
accessible. For example if timer 0 is granted user data access the GenCntStartValue[0], GenCntUnitSel[0],
GenCntAuto[0], GenCntEnable[0] and GenCntValue[0] registers can all be written to and read from without
any restriction.

Attempts to access a user data mode disabled timer configuration register will result in a bus error.

Table 52 details the access modes allowed for registers in the TIM block. In supervisor data mode all
registers are accessible. All forbidden accesses will result in a bus error (tim_cpu_berr asserted).



Table 52. TIM supervisor and user access modes

Address     Register               Access
0x00        WatchDogUnitSel        Supervisor data mode only
0x04        WatchDogTimer          Supervisor data mode only
0x08        WatchDogIntThres       Supervisor data mode only
0x0C-0x10   FreeRunCount           Supervisor data mode only
0x14        GenCntStartValue[0]    GenCntUserModeEnable[0]
0x18        GenCntStartValue[1]    GenCntUserModeEnable[1]
0x1C        GenCntStartValue[2]    GenCntUserModeEnable[2]
0x20        GenCntValue[0]         GenCntUserModeEnable[0]
0x24        GenCntValue[1]         GenCntUserModeEnable[1]
0x28        GenCntValue[2]         GenCntUserModeEnable[2]
0x2C        GenCntUnitSel[0]       GenCntUserModeEnable[0]
0x30        GenCntUnitSel[1]       GenCntUserModeEnable[1]
0x34        GenCntUnitSel[2]       GenCntUserModeEnable[2]
0x38        GenCntAuto[0]          GenCntUserModeEnable[0]
0x3C        GenCntAuto[1]          GenCntUserModeEnable[1]
0x40        GenCntAuto[2]          GenCntUserModeEnable[2]
0x44        GenCntEnable[0]        GenCntUserModeEnable[0]
0x48        GenCntEnable[1]        GenCntUserModeEnable[1]
0x4C        GenCntEnable[2]        GenCntUserModeEnable[2]
0x50        GenCntUserModeEnable   Supervisor data mode only
0x54        DebugSelect            Supervisor data mode only
0x58        PulseTimerStatus       Supervisor data mode only
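The access rules in this table reduce to a small decision function. The Python sketch below is one possible reading: it assumes that program accesses (cpu_acode 00 and 10) are rejected like any other forbidden access, and it relies on the generic timer registers being laid out as five consecutive groups of three words starting at 0x14. The function and constant names are hypothetical.

```python
USER_DATA = 0b01
SUPERVISOR_DATA = 0b11


def tim_access_allowed(offset: int, acode: int, user_mode_enable: int) -> bool:
    """Return True if the access is allowed, False if it raises a bus error."""
    if acode == SUPERVISOR_DATA:
        return True                    # supervisor data mode sees everything
    if acode != USER_DATA:
        return False                   # assumed: program accesses are rejected
    if 0x14 <= offset <= 0x4C:
        # Five register groups of three words each: the word index modulo 3
        # gives the generic timer number.
        timer = ((offset - 0x14) // 4) % 3
        return bool((user_mode_enable >> timer) & 1)
    return False                       # all other registers are supervisor only
```

For example, with GenCntUserModeEnable set to 0b010 a user data access to GenCntValue[1] at 0x24 is allowed, while the same access to GenCntValue[0] at 0x20 results in a bus error.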
16 Clocking, Power and Reset (CPR)

The CPR block provides all of the clock, power enable and reset signals to the SoPEC device. 



16.1 Powerdown modes

The CPR block is capable of powering down certain sections of the SoPEC device. When a section is
powered down (i.e. put in sleep mode) no state is retained, and the CPU must re-initialize the section
before it can be used again. The exact powerdown mechanism is undefined and is technology dependent.
For the purpose of powerdown the SoPEC device is divided into sections:



Table 53. Powerdown sectioning

Section                                       Blocks
Print Engine Pipeline Subsystem (Section 0)   CDU, CFU, LBD, SFU, TE, TFU, HCU, DNC, DWU, LLU, PHI
CPU-DRAM (Section 1)                          DRAM, CPU/MMU, DIU, TIM, ROM, LSS Interface
Comms Subsystem (Section 2)                   USB, ISI, DMA Ctrl, GPIO, PSS, ICU



16.1.1 Sleep mode

Each section can be put into sleep mode by setting the corresponding bit in the SleepModeEnable register.
To re-enable the section the sleep mode bit needs to be cleared and then the section should be reset by
writing to the relevant bit in the ResetSection register. Each block within the section should then be
re-configured by the CPU.

If the CPU system is put into sleep mode, the SoPEC device will remain in sleep mode until a system level
reset is initiated from the reset pin, or a wakeup reset by the SCB block as a result of activity on either the
USB or ISI bus. If all sections are put into sleep mode, then only a system level reset initiated by the reset
pin will re-activate the SoPEC device.

Like all software resets in SoPEC the ResetSection register is active-low, i.e. a 0 should be written to each
bit position requiring a reset. The ResetSection register is self-resetting.



16.2 Reset source

The SoPEC device can be reset by a number of sources. When a reset from an internal source is initiated
the reset source register (ResetSrc) stores the reset source value. This register can then be used by the CPU
to determine the type of boot sequence required.



16.3 Clock relationship

The crystal oscillator excites a 32MHz crystal through the xtalin and xtalout pins. The 32MHz output is
multiplied by the PLL to a master clock frequency of 960MHz. The master clock is then divided to
produce 320MHz clock (clk320), 160MHz clock (clk160), 106MHz clock (clk106) and 48MHz (clk48) clock
sources.

^v.^^l]nfT}^^ f ""^"^^ ^"""^ ^^^^^ '^^ relationship of interna! clocks 

clk320 clkI06. clk48 and clki60 to xtalin wiU be undefined The clock tree generation should create inser- 
tion delays so as to compensate for the phase difference of the clocks leaving the PLL. At the output of the 
clock block, the skew between ^Tich pclk domain (pclk^section[3:0J and jclk) should be within skew toler- 
ances of their respective domains (defined as less than the hold time of a D-type flip flop). 

The skew between doclk and phiclk should also be less than the skew tolerances of their respective 
domams. ^ 

The usbclk is derived from the PLL output and has no relationship with the other clocks in the system and 
is considered asynchronous. 
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There is no skew requirement between the pclk domains and the doclk and phiclk domains, they are con- 
sidered essentially asynchronous to each other. 
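
The frequencies quoted in this section can be cross-checked with a few lines of arithmetic. The sketch
below is illustrative only: the multiplication factor of 30 is simply 960/32 and the division ratios are
the ratios of the quoted frequencies to the 960MHz master clock; neither is a register setting or a
divider structure taken from this document.

```python
from fractions import Fraction

XTAL_MHZ = 32
PLL_RATIO = 30  # assumed: 960 MHz master clock / 32 MHz crystal

master_mhz = Fraction(XTAL_MHZ * PLL_RATIO)

# Ratio of the master clock to each derived clock (assumed from the quoted frequencies).
ratios = {"clk320": 3, "clk160": 6, "clk106": 9, "clk48": 20}
clocks = {name: master_mhz / ratio for name, ratio in ratios.items()}

# Period of the master clock in nanoseconds (1000 / frequency in MHz).
master_period_ns = float(1000 / master_mhz)

for name, mhz in clocks.items():
    print(f"{name}: {float(mhz):.2f} MHz")
print(f"master period: {master_period_ns:.2f} ns")
```

Note that clk106 comes out at 320/3 MHz (about 106.67MHz), and the master period of about 1.04ns matches
the value annotated on the clock relationship figure.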



[Figure 53: timing diagram relating the PLL master clock (1.04ns period) to clk320, doclk, clk160, pclk,
jclk, clk106 and phiclk, showing the PLL phase shifts of clk320, clk160 and clk106 and the insertion
delays of doclk, pclk/jclk and phiclk]

Figure 53. SoPEC clock relationship

16.4 Implementation

16.4.1 Definitions of I/O



Table 54. CPR I/O definition

Port name | Pins | I/O | Description
Clocks and Resets
xtalin | | In | Crystal input, direct from IO pin.
xtalout | | Out | Crystal output, direct to IO pin.
pclk_section[2:0] | | Out | System clocks for each section
phiclk | | Out | Printhead interface clock (doclk/3) for the PHI block
doclk | | Out | Data out clock (2x pclk) for the PHI block
jclk | | Out | Gated version of system clock used to clock the JPEG decoder core in the CDU
usbclk | | Out | USB clock at 3 times the crystal input frequency, nominally at 48 MHz
jclk_enable | | In | Gating signal for jclk.
reset_n | | In | Reset signal from the reset_n pin
usb_cpr_reset_n | | In | Reset signal from the USB block
isi_cpr_reset_n | | In | Reset signal from the ISI block
tim_cpr_reset_n | | In | Reset signal from watchdog timer.
prst_n_section[2:0] | | Out | System resets for each section, synchronous active low
phirst_n | | Out | Reset for PHI block, synchronous to phiclk
dorst_n | | Out | Reset for PHI block, synchronous to doclk
jrst_n | | Out | Reset for JPEG decoder core in CDU block, synchronous to jclk
usbrst_n | | Out | Reset for the USB block, synchronous to usbclk
Test Input
test_clk | | In | Test clock direct from external pin, for use in production test (scan test)
test_enable | | In | Test enable. Direct from external pin. When high production test mode is enabled.
CPU interface
cpu_adr[3:2] | 2 | In | CPU address bus. Only 2 bits are required to decode the address space for the CPR block
cpu_dataout[31:0] | 32 | In | Shared write data bus from the CPU
cpr_cpu_data[31:0] | 32 | Out | Read data bus to the CPU
cpu_rwn | 1 | In | Common read/not-write signal from the CPU.
cpu_cpr_sel | 1 | In | Block select from the CPU. When cpu_cpr_sel is high both cpu_adr and cpu_dataout are valid
cpr_cpu_rdy | 1 | Out | Ready signal to the CPU. When cpr_cpu_rdy is high it indicates the last cycle of the access. For a write cycle this means cpu_dataout has been registered by the block and for a read cycle this means the data on cpr_cpu_data is valid.
cpr_cpu_berr | 1 | Out | Bus error signal to the CPU indicating an invalid access.
cpu_acode[1:0] | 2 | In | CPU Access Code signals. These decode as follows: 00 - User program access; 01 - User data access; 10 - Supervisor program access; 11 - Supervisor data access
cpr_cpu_debug_valid | 1 | Out | Debug Data valid on cpr_cpu_data bus. Active high
Miscellaneous
pwr_sleep_mode[2:0] | 3 | Out | Sleep mode section select



16.4.2 Configuration registers

The configuration registers in the CPR are programmed via the CPU interface. Refer to section 11.4 on
page 69 for a description of the protocol and timing diagrams for reading and writing registers in the CPR.
Note that since addresses in SoPEC are byte aligned and the CPU only supports 32-bit register reads and
writes, the lower 2 bits of the CPU address bus are not required to decode the address space for the CPR.
When reading a register that is less than 32 bits wide zeros should be returned on the upper unused bit(s)
of cpr_cpu_data. Table 55 lists the configuration registers in the CPR block.

The CPR block will only allow supervisor data mode accesses (i.e. cpu_acode[1:0] = SUPERVISOR_DATA).
All other accesses will result in cpr_cpu_berr being asserted.



Table 55. CPR Register Map

Address | Register | #bits | Reset | Description
0x00 | SleepModeEnable | | 0x0 | Sleep mode enable; when high a section of logic is powered down. Each bit controls a section.
0x04 | ResetSrc | | 0x0 (a) | Reset Source register, indicating the source of the last reset: Bit 0 - External reset; Bit 1 - USB wakeup reset; Bit 2 - ISI wakeup reset; Bit 3 - Watchdog timer reset
0x08 | ResetSection | | 0x7 | Active-low synchronous reset for each section, self-resetting.
0x0C | DebugSelect | 6 | 0x00 | Debug address select. Indicates the address of the register to report on the cpr_cpu_data bus when it is not otherwise being used.
PLL Control (Asynchronous reset registers)
0x10 | PLLTuneBits | 10 | 0x23E | PLL tuning bits
0x14 | PLLRangeA | | OxP | PLLOUT A frequency selector (defaults to 600MHz to 1250MHz)
0x18 | PLLRangeB | | 0x7 | PLLOUT B frequency selector (defaults to 600MHz to 1250MHz)
0x1C | PLLMultiplier | | 0x25 | PLL multiplier selector, defaults to refclk x 20

a. Reset value depends on reset source. External reset shown.

16.4.3 CPR Sub-block partition



[Figure 54: CPR block partition. The clock generator (crystal oscillator and PLL, fed by xtalin, xtalout,
test_clk and test_enable) produces clk320, clk160 and clk48; the gate enable logic (jclk_enable,
pwr_sleep_mode) and the clock gates produce the gated clocks; the reset logic (reset_n, usb_cpr_reset_n,
isi_cpr_reset_n, tim_cpr_reset_n) drives reset_dom[6:0] into one reset synchronizer per clock domain
(doclk, phiclk, usbclk, pclk_section[0], pclk_section[1], pclk_section[2] and jclk); the configuration
registers connect to the CPU.]

Figure 54. CPR block partition

16.4.4 Sync reset



The reset synchronizer retimes an asynchronous reset signal to the clock domain that it resets. The circuit
prevents the inactive edge of reset occurring when the clock is rising.

[Figure 55: reset synchronizer retiming reset_dom to pclk to produce prst_n]

Figure 55. Reset synchronizer logic



16.4.5 Reset generator logic

The reset generator logic is used to determine which clock domains should be reset, based on configured
reset values (reset_section_n), the external reset (reset_n), watchdog timer reset (tim_cpr_reset_n) and
resets from the SCB block (isi_cpr_reset_n, usb_cpr_reset_n). The reset direct from the IO pin (reset_n)
is synchronized and de-glitched before feeding the reset logic.

Resets from the SCB block reset everything except its own section (section 2). This allows data to be
stored in the PSS block for use after an SCB powerup initiated reset.

Table 56. Reset domains

Reset domain | Clock domain
reset_dom[0] | doclk domain
reset_dom[1] | phiclk domain
reset_dom[2] | usbclk domain
reset_dom[3] | Section 0 pclk domain
reset_dom[4] | Section 1 pclk domain
reset_dom[5] | Section 2 pclk domain
reset_dom[6] | jclk domain



The logic is given by

    if (reset_n == 0) then
        reset_dom[6:0] = 0x00    // reset everything
        reset_src[3:0] = 0x01
    elsif (usb_cpr_reset_n == 0) then
        reset_dom[6:0] = 0x20    // all except comms domain
        reset_src[3:0] = 0x02
    elsif (isi_cpr_reset_n == 0) then
        reset_dom[6:0] = 0x20    // all except comms domain
        reset_src[3:0] = 0x04
    elsif (tim_cpr_reset_n == 0) then
        reset_dom[6:0] = 0x00    // reset everything
        reset_src[3:0] = 0x08
    else
        // propagate resets from reset section register
        reset_dom[5:0] = 0x3F
        if (reset_section_n[0] == 0) then
            reset_dom[3] = 0
        if (reset_section_n[1] == 0) then
            reset_dom[4] = 0
        if (reset_section_n[2] == 0) then
            reset_dom[5] = 0
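
The priority order of the reset generator can be exercised with a small software model. This is a sketch
for illustration, not the CPR implementation: the function name and the prev_src parameter are
hypothetical, and the model assumes that reset_dom[6] (jclk) idles high when no reset is requested, which
the pseudocode does not state.

```python
def reset_generator(reset_n, usb_cpr_reset_n, isi_cpr_reset_n,
                    tim_cpr_reset_n, reset_section_n, prev_src=0):
    """Return (reset_dom, reset_src); reset_dom bits are active-low."""
    if reset_n == 0:
        return 0x00, 0x01  # external reset: reset everything
    if usb_cpr_reset_n == 0:
        return 0x20, 0x02  # all except the comms domain (bit 5)
    if isi_cpr_reset_n == 0:
        return 0x20, 0x04  # all except the comms domain (bit 5)
    if tim_cpr_reset_n == 0:
        return 0x00, 0x08  # watchdog: reset everything
    # Propagate resets from the ResetSection register.
    reset_dom = 0x7F  # assumption: bit 6 (jclk) also idles high
    for section in range(3):
        if not (reset_section_n >> section) & 1:
            reset_dom &= ~(1 << (section + 3))
    return reset_dom, prev_src  # reset_src keeps its previous value
```

For example, with all hardware resets inactive and reset_section_n = 0b101, only reset_dom[4] (the
Section 1 pclk domain) is driven low.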



16.4.6 Gate enable logic

The gate enable logic is a combinational logic block used to generate gating signals for each of SoPEC's
clock domains. The gate enable (gate_domain) is generated based on the configured sleep_mode_en and
the internally generated jclk_enable signal.

The logic is given by

    // clock gating for sleep modes
    gate_dom[5:3] = 0x7    // default to on
    for (i = 0; i < 3; i++) {
        if (sleep_mode_en[i] == 1) then
            gate_dom[i+3] = 0
            pwr_sleep_mode[i] = 1
    }
    // jclk and remaining
    gate_dom[2:0] = 0x7
    gate_dom[6] = -(jclk_enable)
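
The sleep-mode part of the gate enable logic can be sketched in software as follows. This is an
illustrative model only: the function name is hypothetical, and the value of gate_dom[6] is passed in
directly by the caller (the jclk_gate parameter is an assumption of this sketch) rather than derived from
jclk_enable.

```python
def gate_enables(sleep_mode_en, jclk_gate):
    """Return (gate_dom, pwr_sleep_mode) as integers.

    sleep_mode_en: 3-bit sleep mode enable value, one bit per section.
    jclk_gate: value placed on gate_dom[6], supplied by the caller.
    """
    gate_dom = 0x07 | (0x7 << 3)  # gate_dom[2:0] and gate_dom[5:3] default to on
    pwr_sleep_mode = 0
    for i in range(3):
        if (sleep_mode_en >> i) & 1:
            gate_dom &= ~(1 << (i + 3))  # gate the pclk of the sleeping section
            pwr_sleep_mode |= 1 << i     # flag the section as powered down
    gate_dom |= (jclk_gate & 1) << 6
    return gate_dom, pwr_sleep_mode
```

Putting section 1 to sleep (sleep_mode_en = 0b010) clears gate_dom[4] and sets pwr_sleep_mode[1], leaving
the other section clocks running.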



16.4.7 Clock gate logic 

The clock gate logic is used to safely gate clocks without generating any glitches on the gated clock. When
the enable is high the clock is active, otherwise the clock is gated.

[Figure 56: clock gate logic and timing. gate_dom is retimed against src_clk to give gate_dom_retimed,
which gates src_clk to produce gate_clock]

Figure 56. Clock gate logic diagram

16.4.8 Clock generator logic

The clock generator block contains the PLL, crystal oscillator, clock dividers and associated control and
test logic. The PLL VCO frequency is at 960MHz locked to a 32MHz refclk generated by the crystal
oscillator. In test mode the xtalin signal can be driven directly by the test clock generator; the test
clock will be reflected on the refclk signal to the PLL.

[Figure 57: the crystal oscillator (xtalin, xtalout, test_enable) supplies refclk to the PLL, which is
configured by pll_range_a, pll_range_b, pll_multiplier and pll_tune; the clock dividers, reset by prst_n,
output clk320, clk160, clk106 and clk48]

Figure 57. PLL and Clock divider logic



16.4.8.1 Clock divider A

The clock divider A block generates the 320MHz, 160MHz and 106MHz clocks from the input 320MHz
clock (pll_outb) generated by the PLL. The divider flip flops are asynchronously reset by the prst_n
signal. The dividers are enabled only when the PLL has acquired lock as indicated by the pll_lock signal.

16.4.8.2 Clock divider B

The clock divider B block generates the 48MHz clock from the input 96MHz clock (pll_outa) generated by
the PLL. The divider flip flops are asynchronously reset by the prst_n signal. The dividers are enabled
only when the PLL has acquired lock as indicated by the pll_lock signal.

17 ROM Block



17.1 Overview 

The ROM block interfaces to the CPU bus and contains the SoPEC boot code. The ROM block consists of 
the CPU bus interface, the ROM macro and the ChipID macro. The current ROM size is 16 KBytes imple- 
mented as a 4096 x32 macro. Access to the ROM is not cached because the CPU enjoys fast (no more than 
one cycle slower than a cache access), unarbitrated access to the ROM. 

Each SoPEC device is required to have a unique ChipID which is set by blowing fuses at manufacture.
IBM's 300mm ECID macro is to be used to implement the ChipID and this offers 112 bits of laser fuses.
The exact number of fuse bits to be used for the ChipID will be determined later but all bits are made
available to the CPU. The ECID macro allows all 112 bits to be read out in parallel and the ROM block
will make all 112 bits available in the FuseChipID[N] registers which are readable by the CPU in
supervisor mode only.

17.2 Boot operation

There are two boot scenarios for the SoPEC device, namely after power-on and after being awoken from
sleep mode. When the device is in sleep mode it is hoped that power will actually be removed from the
DRAM, CPU and most other peripherals and so the program code will need to be freshly downloaded each
time the device wakes up from sleep mode. In order to reduce the wakeup boot time (and hence the
perceived print latency) certain data items are stored in the PSS block (see section 18). These data items
include the SHA-1 hash digest expected for the program(s) to be downloaded, the master/slave SoPEC id
and some configuration parameters (currently TBD). All of these data items are stored in the PSS by the
CPU prior to entering sleep mode. The SHA-1 value stored in the PSS is calculated by the CPU by
decrypting the signature of the downloaded program using the appropriate public key stored in ROM. This
compute intensive decryption only needs to take place once as part of the power-on boot sequence;
subsequent wakeup boot sequences will simply use the resulting SHA-1 digest stored in the PSS. Note that
the digest only needs to be stored in the PSS before entering sleep mode and the PSS can be used for
temporary storage of any data at all other times.

The CPU is expected to be in supervisor mode for the entire boot sequence described by the pseudocode 
below. Note that the boot sequence has not been finalised but is expected to be close to the following: 

    if (ResetSrc == 1) then    // Reset was a power-on reset
        configure_sopec        // need to configure peris (USB, ISI, DMA, ICU etc.)
    // Otherwise reset was a wakeup reset so peris etc. were already configured
    PAUSE: wait until IrqSemaphore != 0    // i.e. wait until an interrupt has been serviced
    if (IrqSemaphore == DMAChan0Msg) then
        parse_msg(DMAChan0MsgPtr)    // this routine will parse the message and take any
                                     // necessary action e.g. programming the DMAChannel1 registers
    elsif (IrqSemaphore == DMAChan1Msg) then    // program has been downloaded
        CalculatedHash = gen_sha1(ProgramLocn, ProgramSize)
        if (ResetSrc == 1) then
            ExpectedHash = sig_decrypt(ProgramSig)
        else
            ExpectedHash = PSSHash
        if (ExpectedHash == CalculatedHash) then
            jmp(ProgramLocn)    // transfer control to the downloaded program
        else
            send_host_msg("Program Authentication Failed")
            goto PAUSE:
    elsif (IrqSemaphore == timeout) then    // nothing has happened
        if (ResetSrc == 1) then
            sleep_mode()    // put SoPEC into sleep mode to be woken up by USB/ISI activity
        else                // we were woken up but nothing happened
            reset_sopec(PowerOnReset)
    else
        goto PAUSE
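
The authentication decision at the centre of the boot sequence can be sketched in software. This is an
illustration under stated assumptions: the authenticate function and its parameters are hypothetical, and
decrypt_signature is a stand-in callable for the public key decryption performed by sig_decrypt(), which
is not modelled here.

```python
import hashlib

POWER_ON_RESET = 1  # ResetSrc value used for a power-on reset in the pseudocode


def authenticate(program, reset_src, decrypt_signature, pss_hash):
    """Return True if the downloaded program may be given control.

    program: the downloaded program image as bytes.
    decrypt_signature: hypothetical callable returning the SHA-1 digest
        recovered from the program signature (stand-in for sig_decrypt()).
    pss_hash: digest saved in the PSS before entering sleep mode.
    """
    calculated = hashlib.sha1(program).digest()
    if reset_src == POWER_ON_RESET:
        expected = decrypt_signature()  # compute intensive, power-on only
    else:
        expected = pss_hash             # wakeup: reuse the stored digest
    return expected == calculated
```

On a wakeup reset the expensive decryption is skipped entirely, which is the reason for keeping the
digest in the PSS.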

The boot code places no restrictions on the activity of any programs downloaded and authenticated by it
other than those imposed by the configuration of the MMU, i.e. the principal function of the boot code is
to authenticate that any programs downloaded by it are from a trusted source. It is the responsibility of
the downloaded program to ensure that any code it downloads is also authenticated and that the system
remains secure. The downloaded program code is also responsible for setting the SoPEC ISIId (see section
...). See the "SoPEC Security Overview" document [9] for more details of the SoPEC security features.



17.3 Implementation 



17.3.1 Definitions of I/O

Table 57. ROM Block I/O

Port name | Pins | I/O | Description
Clocks and Resets
prst_n | 1 | In | Global reset. Synchronous to pclk, active low.
pclk | 1 | In | Global clock
CPU Interface
cpu_adr[15:2] | 14 | In | CPU address bus. Only 14 bits are required to decode the address space for this block.
rom_cpu_data[31:0] | 32 | Out | Read data bus to the CPU
cpu_rwn | 1 | In | Common read/not-write signal from the CPU
cpu_acode[1:0] | 2 | In | CPU Access Code signals. These decode as follows: 00 - User program access; 01 - User data access; 10 - Supervisor program access; 11 - Supervisor data access
cpu_rom_sel | 1 | In | Block select from the CPU. When cpu_rom_sel is high cpu_adr is valid
rom_cpu_rdy | 1 | Out | Ready signal to the CPU. When rom_cpu_rdy is high it indicates the last cycle of the access. For a read cycle this means the data on rom_cpu_data is valid.
rom_cpu_berr | 1 | Out | ROM bus error signal to the CPU indicating an invalid access



17.3.2 Configuration registers 



The ROM block will only allow read accesses to the FuseChipID registers with supervisor data space
permissions (i.e. cpu_acode[1:0] = 11). All other accesses of the FuseChipID registers will result in
rom_cpu_berr being asserted. The ROM block allows all read accesses to the ROM itself (i.e. supervisor or
user, data or program accesses). The CPU subsystem bus slave interface is described in more detail in
section 9.4.3.

Table 58. ROM Block Register Map

Address | Register | #bits | Reset | Description
0x8004 | FuseChipID[N] | 32 | n/a | Value of corresponding fuse bits. (Read only)



17.3.3 Sub-Block Partition



IBM offer two variants of their ROM macros: a high performance version (ROMHD) and a low power
version (ROMLD). It is likely that the low power version will be used unless some implementation issue
requires the high performance version. Both versions offer the same bit density. The sub-block partition
diagram below does not include the clocking and test signals for the ROM or ECID macros. The CPU
subsystem bus interface is described in more detail in section 11.4.3.



[Figure 58: the CPU bus interface (cpu_rom_sel, cpu_rwn, rom_cpu_rdy, rom_cpu_berr) connects to the
4096 x 32 ROM macro via rom_adr (12 bits) and rom_data (32 bits), and to the fuses of the IBM 300mm ECID
macro via fuse_data and fuse_reg_adr]

Figure 58. Sub-block partition of the ROM block

17.3.4 Sub-block signal definition 

Table 59. ROM Block Internal signals

Name | Width | Description
Clocks and Resets
prst_n | 1 | Global reset. Synchronous to pclk, active low.
pclk | 1 | Global clock
Internal Signals
rom_adr[11:0] | 12 | ROM address bus
rom_sel | 1 | Select signal to the ROM macro instructing it to access the location at rom_adr
rom_oe | 1 | Output enable signal to the ROM block
rom_data[31:0] | 32 | Data bus from the ROM macro to the CPU bus interface
rom_dvalid | 1 | Signal from the ROM macro indicating that the data on rom_data is valid for the address on rom_adr
fuse_data[31:0] | 32 | Data from the FuseChipID[N] register addressed by fuse_reg_adr
fuse_reg_adr[1:0] | 2 | Indicates which of the FuseChipID registers is being addressed

18 Power Safe Storage (PSS) Block



18.1 Overview 



The PSS block provides 128 bytes of storage space that will maintain its state when the rest of the SoPEC
device is in sleep mode. The PSS is expected to be used primarily for the storage of decrypted signatures
associated with downloaded programmed code but it can also be used to store any information that needs
to survive sleep mode (e.g. configuration details). Note that the signature digest only needs to be stored
in the PSS before entering sleep mode and the PSS can be used for temporary storage of any data at all
other times.

Prior to entering sleep mode the CPU should store all of the information it will need on exiting sleep mode
in the PSS. On emerging from sleep mode the boot code in ROM will read the ResetSrc register in the CPR
block to determine which reset source caused the wakeup. The reset source information indicates whether
or not the PSS contains valid stored data, and the PSS data determines the type of boot sequence to
execute. If for any reason a full power-on boot sequence should be performed (e.g. the printer driver has
been updated) then this is simply achieved by initiating a full software reset.



18.2 Implementation 



The storage area of the PSS block will be implemented as a 128-byte register array. The array is located
from PSS_base through to PSS_base+0x7F in the address map. The PSS block will only allow read or
write accesses with supervisor data space permissions (i.e. cpu_acode[1:0] = 11). All other accesses will
result in pss_cpu_berr being asserted. The CPU subsystem bus slave interface is described in more detail
in section 11.4.3.
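
The address decode and permission check can be illustrated with a small model. This is a sketch, not the
PSS implementation: the PssModel class and its access method are hypothetical, and returning zero read
data on a bus error is an assumption made here for simplicity.

```python
SUPERVISOR_DATA = 0b11  # cpu_acode encoding for a supervisor data access


class PssModel:
    """Hypothetical model of the 128-byte PSS register array."""

    SIZE_BYTES = 128

    def __init__(self):
        self.words = [0] * (self.SIZE_BYTES // 4)  # 32 registers of 32 bits

    def access(self, offset, acode, write_data=None):
        """Return (rdy, berr, read_data) for one bus access.

        offset is the byte offset from PSS_base; bits [6:2] select the word.
        """
        if acode != SUPERVISOR_DATA:
            return 0, 1, 0  # bus error (read data of 0 is an assumption)
        index = (offset >> 2) & 0x1F
        if write_data is not None:
            self.words[index] = write_data & 0xFFFFFFFF
            return 1, 0, 0
        return 1, 0, self.words[index]
```

Because the CPU only performs 32-bit accesses, the 128 bytes appear as 32 words, which is why only the 5
address bits cpu_adr[6:2] are needed.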
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18.2.1 Definitions of I/O 



Table 60. PSS Block I/O

Port name | Pins | I/O | Description
Clocks and Resets
prst_n | 1 | In | Global reset. Synchronous to pclk, active low.
pclk | 1 | In | Global clock
CPU Interface
cpu_adr[6:2] | 5 | In | CPU address bus. Only 5 bits are required to decode the address space for this block.
cpu_dataout[31:0] | 32 | In | Shared write data bus from the CPU
pss_cpu_data[31:0] | 32 | Out | Read data bus to the CPU
cpu_rwn | 1 | In | Common read/not-write signal from the CPU
cpu_acode[1:0] | 2 | In | CPU Access Code signals. These decode as follows: 00 - User program access; 01 - User data access; 10 - Supervisor program access; 11 - Supervisor data access
cpu_pss_sel | 1 | In | Block select from the CPU. When cpu_pss_sel is high both cpu_adr and cpu_dataout are valid
pss_cpu_rdy | 1 | Out | Ready signal to the CPU. When pss_cpu_rdy is high it indicates the last cycle of the access. For a read cycle this means the data on pss_cpu_data is valid.
pss_cpu_berr | 1 | Out | PSS bus error signal to the CPU indicating an invalid access
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19 Low Speed Serial Interface (LSS) 



19.1 Overview 



The Low Speed Serial Interface (LSS) provides a mechanism for the internal SoPEC CPU to communicate
with external QA chips via two independent LSS buses. The LSS communicates through the GPIO block
to the QA chips. This allows the QA chip pins to be reused in multi-SoPEC environments. The LSS Master
system-level interface is illustrated in Figure 59. Note that multiple QA chips are allowed on each LSS
bus.



[Figure 59: within SoPEC the CPU connects over the CPU sub-system bus to the LSS master, which drives
LSS bus 0 (QA Chip 0, QA Chip 1) and LSS bus 1 (QA Chip 2, QA Chip 3) through the GPIO block]

Figure 59. LSS master system-level interface



19.2 QA communication



The SoPEC data interface to the QA Chips is a low speed, 2 pin, synchronous serial bus. Data is
transferred to the QA chips via the lss_data pin synchronously with the lss_clk pin. When lss_clk is high
the data on lss_data is deemed to be valid. Only the LSS master in SoPEC can drive the lss_clk pin; this
pin is an input only to the QA chips. The LSS block must be able to interface with an open-collector
pull-up bus. This means that when the LSS block should transmit a logical zero it will drive 0 on the bus,
but when it should transmit a logical 1 it will leave high-impedance on the bus (i.e. it doesn't drive the
bus). If all the agents on the LSS bus adhere to this protocol then there will be no issues with bus
contention.

The LSS block controls all communication to and from the QA chips. The LSS block is the bus master in
all cases. The LSS block interprets a command register set by the SoPEC CPU, initiates transactions to the
QA chip in question and optionally accepts return data. Any return information is presented through the
configuration registers to the SoPEC CPU. The LSS block indicates to the CPU the completion of a
command or the occurrence of an error via an interrupt.
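
The open-collector behaviour described above amounts to a wired-AND of all agents on the bus. A minimal
sketch of that rule follows; the function names are hypothetical and None is used here to represent an
agent leaving the line at high impedance.

```python
def lss_drive(bit):
    """What an agent puts on the bus to transmit a logical bit.

    A logical 0 is driven low; a logical 1 is left at high impedance (None).
    """
    return 0 if bit == 0 else None


def bus_level(drivers):
    """Resolve an open-collector line with a pull-up.

    drivers: iterable of 0 (agent pulls the line low) or None (released).
    The line reads 0 if any agent pulls it low, otherwise the pull-up wins.
    """
    return 0 if any(d == 0 for d in drivers) else 1
```

Because no agent ever drives a 1, two agents transmitting different values simply resolve to 0 on the
line rather than contending electrically.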



19.2.1 Start and stop conditions 



All transmissions on the LSS bus are initiated by the LSS master issuing a START condition and
terminated by the LSS master issuing a STOP condition. START and STOP conditions are always generated by
the LSS master. As illustrated in Figure 60, a START condition corresponds to a high to low transition on
lss_data while lss_clk is high. A STOP condition corresponds to a low to high transition on lss_data while
lss_clk is high.

Doc: SoPEC_hardware_design, Version: 2.3, Proprietary Document, 29 Nov 2002



[Figure 60: lss_data and lss_clk waveforms marking the START condition and the STOP condition]

Figure 60. START and STOP conditions
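
The START and STOP definitions can be expressed as a simple detector over sampled line values. This is
an illustrative sketch only: the function name and the representation of the bus as a list of
(lss_clk, lss_data) samples are assumptions of the example.

```python
def find_conditions(samples):
    """Return a list of (index, 'START' or 'STOP') events.

    samples: list of (lss_clk, lss_data) pairs taken at successive instants.
    A condition is reported when lss_data changes while lss_clk stays high.
    """
    events = []
    for i in range(1, len(samples)):
        prev_clk, prev_data = samples[i - 1]
        clk, data = samples[i]
        if prev_clk == 1 and clk == 1:  # clock high on both sides of the edge
            if prev_data == 1 and data == 0:
                events.append((i, "START"))  # high to low on lss_data
            elif prev_data == 0 and data == 1:
                events.append((i, "STOP"))   # low to high on lss_data
    return events
```

Data changes made while lss_clk is low are ignored by the detector, which is what allows ordinary data
bits to be distinguished from START and STOP conditions.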



19.2.2 Data transfer 

Data is transferred on the LSS bus via a byte orientated protocol. Bytes are transmitted serially. Each byte
is sent most significant bit (MSB) first through to least significant bit (LSB) last. One clock pulse is
generated for each data bit transferred. Each byte must be followed by an acknowledge bit.

The data on lss_data must be stable during the HIGH period of the lss_clk clock. Data may only
change when lss_clk is low. A transmitter outputs data after the falling edge of lss_clk and a receiver
inputs the data at the rising edge of lss_clk. This data is only considered as a valid data bit at the next
lss_clk falling edge provided a START or STOP is not detected in the period before the next lss_clk falling
edge. All clock pulses are generated by the LSS block. The transmitter releases the lss_data line (high)
during the acknowledge clock pulse (ninth clock pulse). The receiver must pull down the lss_data line
during the acknowledge clock pulse so that it remains stable low during the HIGH period of this clock
pulse.

Data transfers follow the format shown in Figure 61. The first byte sent by the LSS master after a START
condition is a primary id byte, where bits 7-2 form a 6-bit primary id (0 is a global id and will address all
QA Chips on a particular LSS bus), bit 1 is an even parity bit for the primary id, and bit 0 forms the
read/write sense. Bit 0 is high if the following command is a read to the primary id given or low for a
write command to that id. An acknowledge is generated by the QA chip(s) corresponding to the given id (if
such a chip exists) by driving the lss_data line low synchronous with the LSS master generated ninth
lss_clk pulse.

[Figure 61: lss_data and lss_clk waveforms for a transfer consisting of a START condition, ID byte[7:1],
R/W bit and ACK, followed by two DATA bytes each followed by an acknowledge bit, and a STOP condition]

Figure 61. LSS transfer of 2 data bytes
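
The layout of the primary id byte can be sketched as follows. This is an illustration, not code from the
QA Chip Interface Specification: the function names are hypothetical, and the parity bit is assumed to be
chosen so that the six id bits plus the parity bit contain an even number of 1s.

```python
def primary_id_byte(primary_id, read):
    """Build the first byte sent after a START condition.

    Bits 7-2: 6-bit primary id; bit 1: even parity over the id;
    bit 0: 1 for a read, 0 for a write.
    """
    if not 0 <= primary_id < 64:
        raise ValueError("primary id must fit in 6 bits")
    # Assumed convention: id bits plus parity bit hold an even number of 1s.
    parity = bin(primary_id).count("1") & 1
    return (primary_id << 2) | (parity << 1) | (1 if read else 0)


def to_bits_msb_first(byte):
    """Order in which the bits of a byte appear on lss_data."""
    return [(byte >> n) & 1 for n in range(7, -1, -1)]
```

For example, a read addressed to primary id 1 gives the byte 0b00000111, transmitted MSB first and
followed by the acknowledge bit from the addressed QA chip.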



19.2.3 Write procedure 



The protocol for a write access to a QA Chip over the LSS bus is illustrated in Figure 62 below. The LSS
master in SoPEC initiates the transaction by generating a START condition on the LSS bus. It then
transmits the primary id byte with a 0 in bit 0 to indicate that the following command is a write to the
primary id. An acknowledge is generated by the QA chip corresponding to the given primary id. The LSS
master will clock out M data bytes with the slave QA Chip acknowledging each successful byte written.
Once the slave QA chip has acknowledged the Mth data byte the LSS master issues a STOP condition to
complete the transfer. The QA chip gathers the M data bytes together and interprets them as a command.
See QA Chip Interface Specification for more details on the format of the commands used to communicate
with the QA chip [8]. Note that the QA chip is free to not acknowledge any byte transmitted. The LSS
master should respond by issuing an interrupt to the CPU to indicate this error. The CPU should then
generate a STOP condition on the LSS bus to gracefully complete the transaction on the LSS bus.









[Figure 62: write frame layout: S, ID byte[7:1], write bit (0), A, then data bytes Byte 0 to Byte M as
Data(8) fields each followed by A, ending with P]

S = Start condition
A = Ack
N = Nack
P = Stop condition
Shaded bits driven by slave

Figure 62. Example of LSS write to a QA Chip



19.2.4 Read procedure 

The LSS master in SoPEC initiates the transaction by generating a START condition on the LSS bus. It
then transmits the primary id byte with a 1 in bit 0 to indicate that the following command is a read to the
primary id. An acknowledge is generated by the QA chip corresponding to the given primary id. The LSS
master releases the lss_data bus and proceeds to clock the expected number of bytes from the QA chip
with the LSS master acknowledging each successful byte read. The last expected byte is not acknowledged
by the LSS master. It then completes the transaction by generating a STOP condition on the LSS bus. See
QA Chip Interface Specification for more details on the format of the commands used to communicate
with the QA chip [8].




S = Start condition
A = Ack
N = Nack
P = Stop condition
Shaded bits driven by slave



Figure 63. Example of LSS read from QA Chip
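
The acknowledge pattern used by the master during a read can be sketched in a few lines. This is only a
model of the rule stated above: the function name is hypothetical and the QA chip is represented by a
plain byte string rather than by a real slave device.

```python
def read_bytes(slave_bytes, expected):
    """Model the master side of a read and return (data, acks).

    slave_bytes: bytes the QA chip would return (stand-in for the slave).
    expected: number of bytes the master expects to read.
    acks[i] is True when the master acknowledges byte i.
    """
    data = list(slave_bytes[:expected])
    # Every byte is acknowledged except the last expected one.
    acks = [i < len(data) - 1 for i in range(len(data))]
    return data, acks
```

Reading three bytes therefore produces the acknowledge pattern Ack, Ack, Nack before the master issues
the STOP condition.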
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19.3 Implementation 



A block diagram of the LSS master is given in Figure 64. It consists of a block of configuration registers
that are programmed by the CPU and two identical LSS master units that generate the signalling protocols
on the two LSS buses as well as interrupts to the CPU. The CPU initiates and terminates transactions on
the LSS buses by writing an appropriate command to the command register, writes bytes to be transmitted
to a fifo and reads bytes received from a fifo, and checks the sources of interrupts by reading status
registers.

Figure 64. LSS block diagram

19.3.1 Definitions of IO



Table 61. LSS IO pins definitions

Port name | Pins | I/O | Description
Clocks and Resets
pclk | 1 | In | System Clock
prst_n | 1 | In | System reset, synchronous active low
CPU Interface
cpu_rwn | 1 | In | Common read/not-write signal from the CPU
cpu_adr[7:2] | 6 | In | CPU address bus. Only 6 bits are required to decode the address space for this block
cpu_dataout[31:0] | 32 | In | Shared write data bus from the CPU
cpu_acode[1:0] | 2 | In | CPU access code signals. cpu_acode[0] - Program (0) / Data (1) access; cpu_acode[1] - User (0) / Supervisor (1) access
cpu_lss_sel | 1 | In | Block select from the CPU. When cpu_lss_sel is high both cpu_adr and cpu_dataout are valid
lss_cpu_rdy | 1 | Out | Ready signal to the CPU. When lss_cpu_rdy is high it indicates the last cycle of the access. For a write cycle this means cpu_dataout has been registered by the LSS block and for a read cycle this means the data on lss_cpu_data is valid
lss_cpu_berr | 1 | Out | LSS bus error signal to the CPU.
lss_cpu_data[31:0] | 32 | Out | Read data bus to the CPU
lss_cpu_debug_valid | 1 | Out | Active high. Indicates the presence of valid debug data on lss_cpu_data.
GPIO for LSS buses
lss_gpio_do[1:0] | 2 | Out | LSS bus data output. Bit 0 - LSS bus 0; Bit 1 - LSS bus 1
gpio_lss_di[1:0] | 2 | In | LSS bus data input. Bit 0 - LSS bus 0; Bit 1 - LSS bus 1
lss_gpio_e[1:0] | 2 | Out | LSS bus data output enable, active high. Bit 0 - LSS bus 0; Bit 1 - LSS bus 1
lss_gpio_clk[1:0] | 2 | Out | LSS bus clock output. Bit 0 - LSS bus 0; Bit 1 - LSS bus 1
ICU interface
lss_icu_irq[1:0] | 2 | Out | LSS interrupt requests. Bit 0 - interrupt associated with LSS bus 0; Bit 1 - interrupt associated with LSS bus 1

19.3.2 Configuration registers

The configuration registers in the LSS block are programmed via the CPU interface. Refer to section 11.4
on page 69 for a description of the protocol and timing diagrams for reading and writing registers in the
LSS block. Note that since addresses in SoPEC are byte aligned and the CPU only supports 32-bit register
reads and writes, the lower 2 bits of the CPU address bus are not required to decode the address space for
the LSS block. Table 62 lists the configuration registers in the LSS block. When reading a register that is
less than 32 bits wide zeros should be returned on the upper unused bit(s) of lss_cpu_data.

The input cpu_acode signal indicates whether the current CPU access is supervisor, user, program or data.
The configuration registers in the LSS block can only be read or written by a supervisor data access, i.e.
when cpu_acode equals b11. If the current access is a supervisor data access then the LSS responds by
asserting lss_cpu_rdy for a single clock cycle.

If the current access is anything other than a supervisor data access, then the LSS generates a bus error by
asserting lss_cpu_berr for a single clock cycle instead of lss_cpu_rdy, as shown in section 11.4 on page 69.
A write access will be ignored, and a read access will return zero.
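The access rule above can be sketched as a small behavioural model. This is an illustrative sketch only: the function name `lss_cpu_access` and the tuple it returns are assumptions made for the example, not part of the LSS design.

```python
SUPERVISOR_DATA = 0b11  # cpu_acode[1] = 1 (supervisor), cpu_acode[0] = 1 (data)


def lss_cpu_access(cpu_acode, is_write, read_value=0):
    """Model one CPU access to an LSS configuration register.

    Returns (lss_cpu_rdy, lss_cpu_berr, lss_cpu_data, write_accepted).
    The helper and its return shape are hypothetical, for illustration.
    """
    if (cpu_acode & 0b11) == SUPERVISOR_DATA:
        # Supervisor data access: rdy is asserted for a single cycle.
        data = 0 if is_write else read_value & 0xFFFFFFFF
        return (1, 0, data, is_write)
    # Any other access: bus error, write ignored, read returns zero.
    return (0, 1, 0, False)
```

For example, a user data access (cpu_acode = b01) that attempts a write yields a bus error and leaves the register unchanged.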



Table 62. LSS Control Registers

| Address | Register | #bits | Reset | Description |
|---|---|---|---|---|
| Control registers | | | | |
| 0x00 | Reset | 1 | 0x1 | A write to this register causes a reset of the LSS. |
| 0x04 | LssClockHighPeriod | 16 | 0x00C8 | High period of lss_clk expressed as a number of pclk cycles. Transmission over the LSS bus is at a nominal rate of 400kHz, corresponding to a high period of 200 pclk (160MHz) cycles for a 50/50 duty cycle. |
| 0x08 | LssClockLowPeriod | 16 | 0x00C8 | Low period of lss_clk expressed as a number of pclk cycles. Transmission over the LSS bus is at a nominal rate of 400kHz, corresponding to a low period of 200 pclk (160MHz) cycles for a 50/50 duty cycle. |
| LSS bus 0 registers | | | | |
| 0x10 | Lss0IntStatus | 3 | 0x0 | LSS bus 0 interrupt status register. Bit 0 - command completed successfully. Bit 1 - error during processing of command, not-acknowledge received after transmission of primary id byte on LSS bus 0. Bit 2 - error during processing of command, not-acknowledge received after transmission of data byte on LSS bus 0. A 1 in a bit of the lss0_status_set signal causes the corresponding bit in the Lss0IntStatus register to be set. All the bits in Lss0IntStatus are cleared when the Lss0Cmd register gets written to. (Read only register) |
| 0x14 | Lss0CurrentState | 4 | 0x0 | Gives the current state of the LSS bus 0 state machine. (Read only register). (Encoding will be specified upon state machine implementation) |
| 0x18 | Lss0Cmd | 22 | 0x00_0000 | Command register defining sequence of events to perform on LSS bus 0 before interrupting CPU. A write to this register causes all the bits in the Lss0IntStatus register to be cleared as well as generating a lss0_new_cmd pulse. |
| 0x1C-0x2C | Lss0Buffer[4:0] | 5x32 | 0x0000_0000 | LSS Data buffer. Should be filled with transmit data before transmit command, or read data bytes received after a valid read command. |
| LSS bus 1 registers | | | | |
| 0x30 | Lss1IntStatus | 3 | 0x0 | LSS bus 1 interrupt status register. Bit 0 - command completed successfully. Bit 1 - error during processing of command, not-acknowledge received after transmission of primary id byte on LSS bus 1. Bit 2 - error during processing of command, not-acknowledge received after transmission of data byte on LSS bus 1. A 1 in a bit of the lss1_status_set signal causes the corresponding bit in the Lss1IntStatus register to be set. All the bits in Lss1IntStatus are cleared when the Lss1Cmd register gets written to. (Read only register) |
| 0x34 | Lss1CurrentState | 4 | 0x0 | Gives the current state of the LSS bus 1 state machine. (Read only register). (Encoding will be specified upon state machine implementation) |
| 0x38 | Lss1Cmd | 22 | 0x00_0000 | Command register defining sequence of events to perform on LSS bus 1 before interrupting CPU. A write to this register causes all the bits in the Lss1IntStatus register to be cleared as well as generating a lss1_new_cmd pulse. |
| 0x3C-0x4C | Lss1Buffer[4:0] | 5x32 | 0x0000_0000 | LSS Data buffer. Should be filled with transmit data before transmit command, or read data bytes received after a valid read command. |
| Debug registers | | | | |
| 0x50 | LssDebugSel | 5 | 0x00 | Selects register for debug output. This value is used as the input to the register decode logic instead of cpu_adr[6:2] when the LSS block is not being accessed by the CPU, i.e. when cpu_lss_sel is 0. The output lss_cpu_debug_valid is asserted to indicate that the data on lss_cpu_data is valid debug data. This data can be multiplexed onto chip pins during debug mode. |



19.3.2.1 LSS command registers

The LSS command registers define a sequence of events to perform on the respective LSS bus before
issuing an interrupt to the CPU. There is a separate command register and interrupt for each LSS bus. The
format of the command is given in Table 63. The CPU writes to the command register to initiate a sequence
of events on an LSS bus. Once the sequence of events has completed or an error has occurred, an interrupt
is sent back to the CPU.

Some example commands are:

• a single START condition (Start = 1, IdByteEnable = 0, RdWrEnable = 0, Stop = 0)

• a single STOP condition (Start = 0, IdByteEnable = 0, RdWrEnable = 0, Stop = 1)

• a START condition followed by transmission of the id byte (Start = 1, IdByteEnable = 1, RdWrEnable
= 0, Stop = 0, IdByte contains primary id byte)

• a write transfer of 20 bytes from the data buffer (Start = 0, IdByteEnable = 0, RdWrEnable = 1,
RdWrSense = 0, Stop = 0, TxRxByteCount = 20)

• a read transfer of 8 bytes into the data buffer (Start = 0, IdByteEnable = 0, RdWrEnable = 1,
RdWrSense = 1, ReadNack = 0, Stop = 0, TxRxByteCount = 8)

• a complete read transaction of 16 bytes (Start = 1, IdByteEnable = 1, RdWrEnable = 1, RdWrSense = 1,
ReadNack = 1, Stop = 1, IdByte contains primary id byte, TxRxByteCount = 16), etc.

The CPU can thus control the number of bytes to be transmitted or received (up to a maximum of 20) on
the LSS bus before it gets interrupted. This allows it to insert arbitrary delays in a transfer at a byte
boundary. For example the CPU may want to transmit 30 bytes to a QA chip but insert a delay between the
20th and 21st bytes sent. It does this by first writing 20 bytes to the data buffer. It then writes a command
to generate a START condition, send the primary id byte and then transmit the 20 bytes from the data buffer.
When interrupted by the LSS block to indicate successful completion of the command the CPU can then
write the remaining 10 bytes to the data buffer. It can then wait for a defined period of time before writing
a command to transmit the 10 bytes from the data buffer and generate a STOP condition to terminate the
transaction over the LSS bus.

An interrupt to the CPU is generated for one cycle when any bit in LssNIntStatus is set. The CPU can read
LssNIntStatus to discover the source of the interrupt and can clear a bit in LssNIntStatus by writing a 1 to
the corresponding bit in the LssNIntStatus register. Alternatively the CPU can start a new command which
will automatically reset all LssNIntStatus bits.



Table 63. LSS command register description

| Bit(s) | Name | Description |
|---|---|---|
| 0 | Start | When 1, issue a START condition on the LSS bus. |
| 1 | IdByteEnable | ID byte transmit enable: 1 - transmit byte in IdByte field. 0 - ignore byte in IdByte field. |
| 2 | RdWrEnable | Read/write transfer enable: 0 - ignore settings of RdWrSense, ReadNack and TxRxByteCount. 1 - if RdWrSense is 0, then perform a write transfer of TxRxByteCount bytes from the data buffer; if RdWrSense is 1, then perform a read transfer of TxRxByteCount bytes into the data buffer. Each byte should be acknowledged and the last byte received is acknowledged/not-acknowledged according to the setting of ReadNack. |
| 3 | RdWrSense | Read/write sense indicator: 0 - write. 1 - read. |
| 4 | ReadNack | Indicates, for a read transfer, whether to issue an acknowledge or a not-acknowledge after the last byte received (indicated by TxRxByteCount). 0 - issue acknowledge after last byte received. 1 - issue not-acknowledge after last byte received. |
| 5 | Stop | When 1, issue a STOP condition on the LSS bus. |
| 7:6 | reserved | Must be 0 |
| 15:8 | IdByte | Byte to be transmitted if IdByteEnable is 1. Bit 8 corresponds to the LSB. |
| 20:16 | TxRxByteCount | Number of bytes to be transmitted from the data buffer or the number of bytes to be received into the data buffer. The maximum value that should be programmed is 20 as the size of the data buffer is 20 bytes. |
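As an illustration of the bit layout in Table 63, the following sketch packs the command fields into the value a CPU would write to LssNCmd. The function name `encode_lss_cmd` and its keyword arguments are hypothetical; only the bit positions come from the table.

```python
def encode_lss_cmd(start=0, id_byte_enable=0, rd_wr_enable=0, rd_wr_sense=0,
                   read_nack=0, stop=0, id_byte=0, tx_rx_byte_count=0):
    """Pack LSS command fields into a register value (hypothetical helper)."""
    if not 0 <= id_byte <= 0xFF:
        raise ValueError("IdByte must fit in 8 bits")
    if not 0 <= tx_rx_byte_count <= 20:
        raise ValueError("TxRxByteCount must not exceed the 20 byte buffer")
    word = (start & 1)                      # bit 0: Start
    word |= (id_byte_enable & 1) << 1       # bit 1: IdByteEnable
    word |= (rd_wr_enable & 1) << 2         # bit 2: RdWrEnable
    word |= (rd_wr_sense & 1) << 3          # bit 3: RdWrSense
    word |= (read_nack & 1) << 4            # bit 4: ReadNack
    word |= (stop & 1) << 5                 # bit 5: Stop (bits 7:6 stay 0)
    word |= id_byte << 8                    # bits 15:8: IdByte
    word |= tx_rx_byte_count << 16          # bits 20:16: TxRxByteCount
    return word
```

With this layout a lone START condition encodes as 0x000001 and a lone STOP condition as 0x000020. The id byte value 0xA5 used in the test is an arbitrary example.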



The data buffer is implemented in the LSS master block. When the CPU writes to the LssNBuffer registers
the data written is presented to the LSS master block via the lssN_buffer_wrdata bus and the configuration
registers block pulses the lssN_buffer_wen bit corresponding to the register written. For example if
LssNBuffer[2] is written to, lssN_buffer_wen[2] will be pulsed. When the CPU reads the LssNBuffer registers
the configuration registers block reflects the lssN_buffer_rdata bus back to the CPU.



19.3.3 LSS master unit



The LSS master unit is instantiated for both LSS bus 0 and LSS bus 1. It controls transactions on the LSS
bus by means of the state machine shown in Figure 65, which interprets the commands that are written by
the CPU. It also contains a single 20 byte data buffer used for transmitting and receiving data.

The CPU can write data to be transmitted on the LSS bus by writing to the LssNBuffer registers. It can also
read data that the LSS master unit receives on the LSS bus by reading the same registers. The LSS master
always transmits or receives bytes to or from the data buffer in the same order. For example, for a transmit
command, LssNBuffer[0][7:0] gets transmitted first, then LssNBuffer[0][15:8], LssNBuffer[0][23:16],
LssNBuffer[0][31:24], LssNBuffer[1][7:0] and so on until TxRxByteCount number of
bytes are transmitted. A receive command fills data to the buffer in the same order. For each new command
the buffer start point is reset.
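The byte ordering described above corresponds to little-endian packing of the byte stream into the five 32-bit buffer registers. A minimal sketch, assuming a hypothetical helper `pack_lss_buffer` that software might use before writing the LssNBuffer registers:

```python
BUFFER_BYTES = 20   # size of the LSS data buffer
WORD_BYTES = 4      # each LssNBuffer register is 32 bits wide


def pack_lss_buffer(data):
    """Return the five register values for up to 20 bytes of transmit data.

    Byte 0 lands in LssNBuffer[0][7:0], byte 1 in LssNBuffer[0][15:8], and
    so on, which is the order in which the LSS master transmits them.
    Unused bytes are padded with zero (a choice made for this sketch).
    """
    data = bytes(data)
    if len(data) > BUFFER_BYTES:
        raise ValueError("the data buffer holds at most 20 bytes")
    padded = data + bytes(BUFFER_BYTES - len(data))
    return [int.from_bytes(padded[i:i + WORD_BYTES], "little")
            for i in range(0, BUFFER_BYTES, WORD_BYTES)]
```

For instance, the five bytes 0x11, 0x22, 0x33, 0x44, 0x55 occupy all of register 0 and the low byte of register 1.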

All state machine outputs, flags and counters are cleared on reset. After a reset the state machine remains
in the Idle state until lss_cmd_valid equals 1. If the Start bit of the command is 0 the state machine
proceeds directly to the CheckIdByteEnable state. If the Start bit is 1 it proceeds to the GenerateStart state
and issues a START condition on the LSS bus.

In the CheckIdByteEnable state, if the IdByteEnable bit of the command is 0 the state machine proceeds
directly to the CheckRdWrEnable state. If the IdByteEnable bit is 1 the state machine enters the SendIdByte
state and the byte in the IdByte field of the command is transmitted on the LSS. The WaitForIdAck
state is then entered. If the byte is acknowledged, the state machine proceeds to the CheckRdWrEnable
state. If the byte is not-acknowledged, the state machine proceeds to the GenerateInterrupt state and issues
an interrupt to indicate a not-acknowledge was received after transmission of the primary id byte.

In the CheckRdWrEnable state, if the RdWrEnable bit of the command is 0 the state machine proceeds
directly to the CheckStop state. If the RdWrEnable bit is 1, count is loaded with the value of the
TxRxByteCount field of the command and the state machine enters either the ReceiveByte state if the
RdWrSense bit of the command is 1 or the TransmitByte state if the RdWrSense bit is 0.

For a write transaction, the state machine keeps transmitting bytes from the data buffer, decrementing
count after each byte transmitted, until count is 1. If all the bytes are successfully transmitted the state
machine proceeds to the CheckStop state. If the slave QA chip not-acknowledges a transmitted byte the
state machine indicates this error by issuing an interrupt to the CPU and then entering the
GenerateInterrupt state.

For a read transaction, the state machine keeps receiving bytes into the data buffer, decrementing count
after each byte received, until count is 1. After each byte received the LSS master must issue an
acknowledge. After the last expected byte (i.e. when count is 1) the state machine checks the ReadNack bit
of the command to see whether it must issue an acknowledge or not-acknowledge for that byte. The
CheckStop state is then entered.

In the CheckStop state, if the Stop bit of the command is 0 the state machine proceeds directly to the
GenerateInterrupt state. If the Stop bit is 1 it proceeds to the GenerateStop state and issues a STOP condition
on the LSS bus before proceeding to the GenerateInterrupt state. In both cases an interrupt is issued to
indicate successful completion of the command.

The state machine then enters the Idle state to await the next command.

The CPU may abort the current transfer at any time by performing a write to the Reset register of the LSS
block.
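The state sequence described in this section can be traced with a simplified software model. This is a sketch under stated assumptions: the function `lss_state_trace` is hypothetical, it lists one ReceiveByte or TransmitByte entry per byte, and it omits the per-byte acknowledge states and data-byte not-acknowledge handling for brevity.

```python
def lss_state_trace(start, id_byte_enable, rd_wr_enable, rd_wr_sense, stop,
                    count=0, id_acked=True):
    """Return the list of states visited for one command (simplified model)."""
    trace = ["Idle"]
    if start:
        trace.append("GenerateStart")
    trace.append("CheckIdByteEnable")
    if id_byte_enable:
        trace += ["SendIdByte", "WaitForIdAck"]
        if not id_acked:
            # Not-acknowledge after the primary id byte ends the command.
            return trace + ["GenerateInterrupt", "Idle"]
    trace.append("CheckRdWrEnable")
    if rd_wr_enable:
        byte_state = "ReceiveByte" if rd_wr_sense else "TransmitByte"
        trace += [byte_state] * count
    trace.append("CheckStop")
    if stop:
        trace.append("GenerateStop")
    return trace + ["GenerateInterrupt", "Idle"]
```

Tracing the "single START condition" command shows that every command still passes through each Check state before the interrupt is raised.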
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19.3.3. 1 START and STOP generation 



START and STOP conditions, which signal the beginning and end of data transmission, occur when the
LSS master generates a falling and rising edge respectively on the data while the clock is high.

In the GenerateStart state, lss_gpio_clk is held high with lss_gpio_e remaining deasserted (so the data line
is pulled high externally) for LssClockHighPeriod pclk cycles. Then lss_gpio_e is asserted and
lss_gpio_do is pulled low (to drive a 0 on the data line, creating a falling edge) with lss_gpio_clk
remaining high for another LssClockHighPeriod pclk cycles.

In the GenerateStop state, both lss_gpio_clk and lss_gpio_do are pulled low followed by the assertion of
lss_gpio_e to drive a 0 while the clock is low. After LssClockLowPeriod pclk cycles, lss_gpio_clk is set
high. After a further LssClockHighPeriod pclk cycles, lss_gpio_e is deasserted to release the data bus and
create a rising edge on the data bus during the high period of the clock.



19.3.3.2 Clock pulse generation

The LSS master holds lss_gpio_clk high while the LSS bus is inactive. A clock pulse is generated for each
bit transmitted or received over the LSS bus. It is generated by first holding lss_gpio_clk low for
LssClockLowPeriod pclk cycles, and then high for LssClockHighPeriod pclk cycles.

19.3.3.3 Data reception 

The input data, gpio_lss_di, is first synchronised to the pclk domain by means of two flip-flops clocked by
pclk. The LSS master generates a clock pulse for each bit received. The output lss_gpio_e is deasserted on
the falling edge of lss_gpio_clk to release the data bus. The value on the synchronised gpio_lss_di is
sampled on the rising edge of lss_gpio_clk (the data should be averaged over a further 3 stage register to
avoid possible glitch detection). The data is only considered as a valid bit at the next falling edge of
lss_gpio_clk, provided a START or STOP is not generated in the meantime.

In the ReceiveByte state, the state machine generates 8 clock pulses. On each rising edge of lss_gpio_clk
the synchronised gpio_lss_di is sampled. The first bit sampled is LssNBuffer[0][7], the second
LssNBuffer[0][6], etc. to LssNBuffer[0][0]. For each byte received the state machine either sends a NAK or an
ACK depending on the command configuration and the number of bytes received.

In the SendNack state the state machine generates a single clock pulse. lss_gpio_e is deasserted and the
LSS data line is pulled high externally to issue a not-acknowledge.

In the SendAck state the state machine generates a single clock pulse. lss_gpio_e is asserted and a 0 driven
on lss_gpio_do after the lss_gpio_clk falling edge to issue an acknowledge.

19.3.3.4 Data transmission 

The LSS master generates a clock pulse for each bit transmitted. Data is output on the LSS bus on the
falling edge of lss_gpio_clk.

When the LSS master drives a logical zero on the bus it will assert lss_gpio_e and drive a 0 on lss_gpio_do
after the lss_gpio_clk falling edge. lss_gpio_e will remain asserted and lss_gpio_do will remain low until the
next lss_clk falling edge.

When the LSS master drives a logical one lss_gpio_e should be deasserted at the lss_gpio_clk falling edge and
remain deasserted at least until the next lss_gpio_clk falling edge. This is because the LSS bus will be
externally pulled up to logical one via a pull-up resistor.

In the SendIdByte state, the state machine generates 8 clock pulses to transmit the byte in the IdByte field
of the current valid command. On each falling edge of lss_gpio_clk a bit is driven on the data bus as
outlined above. On the first falling edge IdByte[7] is driven on the data bus, on the second falling edge
IdByte[6] is driven out, etc.

In the TransmitByte state, the state machine generates 8 clock pulses to transmit the byte at the output of
the transmit FIFO. On each falling edge of lss_gpio_clk a bit is driven on the data bus as outlined above.
On the first falling edge LssNBuffer[0][7] is driven on the data bus, on the second falling edge
LssNBuffer[0][6] is driven out, etc. on to LssNBuffer[0][0].

In the WaitForAck state, the state machine generates a single clock pulse. On the rising edge of
lss_gpio_clk the synchronised gpio_lss_di is sampled. A 0 indicates an acknowledge and ack_detect is
pulsed, a 1 indicates a not-acknowledge and nack_detect is pulsed.



19.3.3.5 Data rate control



The CPU can control the data rate by setting the clock period of the LSS bus clock by programming
appropriate values in LssClockHighPeriod and LssClockLowPeriod. The default setting for both registers is 200
(pclk cycles) which corresponds to a transmission rate of 400kHz on the LSS bus.
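The relationship between the two period registers and the bus rate is simple arithmetic, sketched below. The function names are illustrative assumptions; the 160 MHz pclk figure and the 200-cycle defaults come from the register descriptions.

```python
PCLK_HZ = 160_000_000  # pclk frequency stated for SoPEC


def lss_bus_rate_hz(high_period, low_period, pclk_hz=PCLK_HZ):
    """Bus clock rate for given LssClockHighPeriod / LssClockLowPeriod values."""
    return pclk_hz / (high_period + low_period)


def period_for_rate(rate_hz, pclk_hz=PCLK_HZ):
    """Value to program in both registers for a 50/50 duty cycle at rate_hz."""
    return pclk_hz // (2 * rate_hz)
```

With the default value of 200 in both registers, one bus clock period is 400 pclk cycles, which gives 160 MHz / 400 = 400 kHz.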
State machine outputs lss_icu_irq and lssStatusSet are zero unless otherwise
indicated.




Figure 65. LSS master state machine 
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20 DRAM Interface Unit (DIU) 



20.1 



Overview 



Figure 66 shows how the DIU provides the interface between the on-chip 20 Mbit embedded DRAM and
the rest of SoPEC. In addition to outlining the functionality of the DIU, this chapter provides a top-level
overview of the memory storage and access patterns of SoPEC and the buffering required in the various
SoPEC blocks to support those access requirements.

The main functionality of the DIU is to arbitrate between requests for access to the embedded DRAM and
provide read or write accesses to the requesters. The DIU must also implement the initialisation sequence
and refresh logic for the embedded DRAM.

The arbitration mechanism is a hierarchical timeslot mechanism providing guaranteed bandwidth and
latency to each DIU requester, with unused slots re-allocated to provide best effort accesses. The
arbitration scheme is fully programmable.

The interface between the DIU and the SoPEC requesters is similar to the interface on PEC1, i.e. separate
control, read data and write data busses.

The embedded DRAM is used principally to store:

• CPU program code and data.

• PEP (re)programming commands.

• Compressed pages containing contone, bi-level and raw tag data and header information.

• Decompressed contone and bi-level data.

• Dotline store during a print.

• Print setup information such as tag format structures, dither matrices and dead nozzle information.
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Figure 66. SoPEC System Top Level partition 
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20.2 IBM Cu-11 Embedded DRAM 

20.2.1 Single bank 

SoPEC will use the 1.5 V core voltage option in IBM's 0.13 µm class Cu-11 process.
The random read/write cycle time and the refresh cycle time is 3 cycles at 160 MHz [16]. An open page
access will complete in 1 cycle if the page mode select signal is clocked at 320 MHz or 2 cycles if the page
mode select signal is clocked every 160 MHz cycle. The page mode select signal will be clocked at 320
MHz in SoPEC. The DRAM word size is 256 bits.

Most SoPEC requesters will make single 256 bit DRAM accesses (see Section 20.4). These accesses will
take 3 cycles as they are random accesses, i.e. they will most likely be to a different memory row than the
previous access.

The entire 20 Mbit DRAM will be implemented as a single memory bank. In Cu-11, the maximum single
instance size is 16 Mbit. The first 1 Mbit tile of each instance contains an area overhead so the cheapest
solution in terms of area is to have only 2 instances. 16 Mbit and 4 Mbit instances would together consume
an area of 14.63 mm² as would 2 times 10 Mbit instances. 4 times 5 Mbit instances would require 17.2 mm².

The instance size will determine the frequency of refresh. Each refresh requires 3 clock cycles. In Cu-11
each row consists of 8 columns of 256-bit words. This means that 16 Mbit requires 8192 rows. A complete
DRAM refresh is required every 3.2 ms. This would mean a row would have to be refreshed every 62
cycles. Two times 10 Mbit instances would require a refresh every 100 clock cycles, if the instances are
refreshed in parallel. Having 4 times 5 Mbit instances means a refresh is required only every 200 cycles.
The SoPEC DRAM will be constructed as two 10 Mbit instances implemented as a single memory bank.
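The refresh figures quoted above follow directly from the row organisation. The sketch below reproduces the arithmetic; the function name and parameter names are assumptions made for the example, while the constants (8 words of 256 bits per row, 3.2 ms refresh period, 160 MHz clock) are taken from the text.

```python
def refresh_interval(instance_mbit, refresh_period_us=3200, clk_mhz=160,
                     words_per_row=8, word_bits=256):
    """Return (rows, cycles between row refreshes) for one DRAM instance."""
    bits = instance_mbit * 1024 * 1024
    rows = bits // (words_per_row * word_bits)
    total_cycles = refresh_period_us * clk_mhz  # cycles in one refresh period
    return rows, total_cycles / rows
```

A 16 Mbit instance gives 8192 rows and one row refresh every 62.5 cycles (quoted as 62 in the text), a 10 Mbit instance gives 5120 rows and 100 cycles, and a 5 Mbit instance gives 2560 rows and 200 cycles.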
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20.3 SoPEC Memory Usage Requirements 

The memory usage requirements for the embedded DRAM are shown in Table 64.

Table 64. Memory Usage Requirements

| Block | Size | Description |
|---|---|---|
| Compressed page store | 2048 Kbytes | Compressed data page store for bi-level and contone data |
| Decompressed Contone Store | 108 Kbyte | 13824 dots/line with scale factor 6 = 2304 pixels, store 12 lines, 4 colors = 108 kB. 13824 dots/line with scale factor 5 = 2765 pixels, store 12 lines, 4 colors = 130 kB |
| Spot line store | 5.1 Kbyte | 13824 dots/line so 3 lines is 5.1 kB |
| Tag Format Structure | 55 Kbyte (384 dot line tags @ 1600 dpi), 12 Kbyte (2.5 mm tags @ 800 dpi) | 55 kB for 384 dot line tags. 2.5 mm tags (1/10th inch) @ 1600 dpi require 160 dot lines = 160/384 x 55 or 23 kB. 2.5 mm tags @ 800 dpi require 80/384 x 55 = 12 kB |
| Dither Matrix store | 4 Kbytes | 64x64 dither matrix is 4 kB. 128x128 dither matrix is 16 kB. 256x256 dither matrix is 64 kB |
| DNC Dead Nozzle Table | 1.4 Kbytes | Delta encoded, (10 bit delta position + 6 dead nozzle mask) x % Dnozzle. 5% dead nozzles requires (10+6) x 692 Dnozzles = 1.4 Kbytes |
| Dot-line store | 319 Kbytes | Assume each color row is separated by 5 dot lines on the print head. The dot line store will be 0+5+10...50+55 = 330 half dot lines + 48 extra half dot lines (4 per dot row) = 378 half dot lines = 319 Kbytes |
| PCU Program code | 8 Kbytes | 1024 commands of 64 bits = 8 kB |
| CPU | 64 Kbytes | Program code and data |
| TOTAL | 2570 Kbytes (12 Kbyte TFS storage), 2613 Kbytes (55 Kbyte TFS) | |

Note: Total storage of 2570 Kbytes will be reduced to 2560 Kbytes to align to 20 Mbit DRAM.
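Several of the sizes in Table 64 can be cross-checked from the 13824 dots/line figure. The following is a rough sketch of that arithmetic; the function names are hypothetical and the rounding choices (ceiling for pixel and nozzle counts) are assumptions that happen to match the quoted values.

```python
import math

DOTS_PER_LINE = 13824


def contone_store_bytes(scale_factor, lines=12, colors=4):
    """Decompressed contone store: one byte per color per contone pixel."""
    pixels = math.ceil(DOTS_PER_LINE / scale_factor)
    return pixels * lines * colors


def dot_line_store_bytes(rows=12, row_gap=5, extra_per_row=4):
    """Dot-line store: half dot lines of DOTS_PER_LINE / 2 bits each."""
    half_lines = sum(range(0, rows * row_gap, row_gap)) + rows * extra_per_row
    return half_lines * (DOTS_PER_LINE // 2) // 8


def dead_nozzle_table_bytes(dead_percent=5, bits_per_entry=16):
    """Dead nozzle table: 10 bit delta + 6 bit mask per dead nozzle."""
    entries = math.ceil(DOTS_PER_LINE * dead_percent / 100)
    return entries * bits_per_entry // 8
```

These give 110592 bytes (108 Kbytes) for the contone store at scale factor 6, 326592 bytes (about 319 Kbytes) for the dot-line store, and 1384 bytes (about 1.4 Kbytes) for the dead nozzle table.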
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20.4 SoPEC Memory Access Patterns 



Table 65 shows a summary of the blocks on SoPEC requiring access to the embedded DRAM and their
individual memory access patterns. Most blocks will access the DRAM in single 256-bit accesses. All
accesses must be padded to 256-bits except for 64-bit CDU write accesses and CPU write accesses. Bits
which should not be written are masked using the individual DRAM bit write inputs or byte write inputs
depending on the foundry. Using single 256-bit accesses means that the buffering required in the SoPEC
DRAM requesters will be minimized.



Table 65. Memory access patterns of SoPEC DRAM Requesters

| DRAM requester | Direction | Memory access pattern |
|---|---|---|
| CPU | R | Single 256-bit reads. |
| CPU | W | Single 32-bit, 16-bit or 8-bit writes. |
| SCB | W | Single 256-bit writes. |
| CDU | R | Single 256-bit reads of the compressed contone data. |
| CDU | W | Each CDU access is a write to 4 consecutive DRAM words in the same row but only 64 bits of each word are written with the remaining bits write masked. The access time for this 4 word page mode burst is 3 + 1 + 1 + 1 = 6 cycles if the page mode select signal is clocked at 320 MHz. |
| CFU | R | Single 256 bit reads. |
| LBD | R | Single 256 bit reads. |
| SFU | R | Separate single 256 bit reads for previous and current line but sharing the same DIU interface. |
| SFU | W | Single 256 bit writes. |
| TE(TD) | R | Single 256 bit reads. Each read returns 2 times 128 bit tags. |
| TE(TFS) | R | Single 256 bit reads. TFS is 136 bytes. This means there is unused data in the fifth 256 bit read. A total of 5 reads is required. |
| HCU | R | Single 256 bit reads. 128 x 128 dither matrix requires 4 reads per line with double buffering. 256 x 256 dither matrix requires 8 reads at the end of the line with single buffering. Dither matrices have start address, end address and line advance increment. |
| DNC | R | Single 256 bit dead nozzle table reads. Each dead nozzle table read contains 16 dead-nozzle table entries each of 10 delta bits plus 6 dead nozzle mask bits. |
| DWU | W | Single 256 bit writes since enable/disable DRAM access per color plane. |
| LLU | R | Single 256 bit reads since enable/disable DRAM access per color plane. |
| PCU | R | Single 256 bit reads. Each PCU command is 64 bits so each 256 bit word can contain 4 PCU commands. PCU reads from DRAM used for reprogramming PEP should be executed with minimum latency. If this occurs between pages then there will be free bandwidth as most of the other SoPEC units will not be requesting from DRAM. If this occurs between bands then the LBD, CDU and TE bandwidth will be free. So the PCU should have a high priority to access any spare bandwidth. |
| Refresh | | Single refresh. |
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20.5 Buffering Required in SoPEC DRAM Requesters 

If each DIU access is a single 256-bit access then we need to provide a 256-bit double buffer in the DRAM 
requester. If the DRAM requester has a 64-bit interface then this can be implemented as an 8 x 64-bit 



Table 66. Buffer sizes in SoPEC DRAM requesters

| DRAM Requester | Direction | Access patterns | Buffering required in block |
|---|---|---|---|
| CPU | R | Single 256-bit reads. | Cache. |
| CPU | W | Single 32-bit writes but allowing 16-bit or byte addressable writes. | None. |
| SCB | W | Single 256-bit writes. | Double 256-bit buffer. |
| CDU | R | Single 256-bit reads of the compressed contone data. | Double 256-bit buffer. |
| CDU | W | Each CDU access is a write to 4 consecutive DRAM words in the same row but only 64 bits of each word are written with the remaining bits write masked. | Double half JPEG block buffer. |
| CFU | R | Single 256 bit reads. | Double 256-bit buffer. |
| LBD | R | Single 256 bit reads. | Double 256-bit buffer. |
| SFU | R | Separate single 256 bit reads for previous and current line but sharing the same DIU interface. | Double 256-bit buffer for each read channel. |
| SFU | W | Single 256 bit writes. | Double 256-bit buffer. |
| TE(TD) | R | Single 256 bit reads. | Double 256-bit buffer. |
| TE(TFS) | R | Single 256 bit reads. TFS is 136 bytes. This means there is unused data in the fifth 256 bit read. A total of 5 reads is required. | Double line-buffer for 136 bytes implemented in TE. |
| HCU | R | Single 256 bit reads. 128 x 128 dither matrix requires 4 reads per line with double buffering. 256 x 256 dither matrix requires 8 reads at the end of the line with single buffering. | Configurable between double 128 byte buffer and single 256 byte buffer. |
| DNC | R | Single 256 bit reads. | Double 256-bit buffer. Deeper buffering could be specified to cope with local clusters of dead nozzles. |
| DWU | W | Single 256 bit writes per enabled odd/even color plane. | Double 256-bit buffer per color plane. |
| LLU | R | Single 256 bit reads per enabled odd/even color plane. | Double 256-bit buffer per color plane. |
| PCU | R | Single 256 bit reads. Each PCU command is 64 bits so each 256 bit DRAM read can contain 4 PCU commands. Requested command is read from DRAM together with the next 3 contiguous 64-bits which are cached to avoid unnecessary DRAM reads. | Single 256-bit buffer. |
| Refresh | | Single refresh. | None. |
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20.6 SoPEC DIU Bandwidth Requirements 

Table 67. SoPEC DIU Bandwidth Requirements





W 



128(SF = 4).288(SF. 
6), 1 :1 compression^ 



For Individual accesses: 
16cydes (SF = 4), 36 
cydes (SF = 6), i^cydes 
(SF=n). 

Will be implemented as a 
page mode burst of 4 
accesses every 64 cydes 
(SF = 4),144 (SF =6). 
4*n^(SF an) cydes^ 

32(SF = 4),48(SF = 6)'' 



32/n2 (SR=n), 
0,9 (SF = 6). 
2{SF=4) 
(1:1 compression) 



64/n2 (SF=n). 
1.8(SF = 6), 
4(SF = 4) 



32/1 0-n^ (SF=n). 
0.09 (SF = 6). 
0.2 (SF=.4) 
(10:1 connpresslon)^ 



32/n2 (SF=n). 
0.9 (SF = 6). 
2(SFs4)^ 



1 (SF==6) 
2(SF=4) 



2 (SFr=6) 
4 (SFa4> 



CFU 



LBD 



32/n (SFsn), 
5.4 (SF = 6), 
8 (SF = 4) 



32/n (SF=n). 
5.4 (SF«6), 
8 (SF « 4) 



5.5 (SF=6) 
8(SF=4) 



256 (1:1 compression)^ 
128'° 



1 (1:1 compression) 



0.1 (10:1 compression)^ 



SFU 



w 



256' 



TE(TD) 



TECTFS) 
HCU 



DNC 



DWU 



R 



LLU 



PCU 



Refresh 



TOTAL 



252*2 



5 reads per line'^ 



1.02 



1.02 



0.093 



4 reads per line for 1 28 x 
128 dither matrix^^ 



0.093 



1.25 
0.25 



0.074 



0.074 



0.25 



106 (5% dead-nozzles 
10-bit delta encoded)''* 



2.4 (dump of dead 
nozzles) 



6 writes every 256^* 



0.8 (equally spaced 
dead nozzles) 



2.5 



8 reads every 256^^ 



256'« 



100*» 



2.56 



2.56 



2.75 



SF 8 6: 34 
SF = 4: 39.5 

excluding CPU 



SF = 6;27.5 
SFa4: 31.2 
exduding CPU 



SFs6:35 
exduding CPU. 
SFs4: 40.5 
exduding CPU 



Notes: 

1: The number of allocated timcslots is based on 64 timeslots each of I bit/cycle but broken down 
0.25 bit/cycle. 

2: 50 Mbit/s is 0.328 bits/cycle or 256 bits every 780 cycles. 

3: At 1 : 1 compression CDU must read a 4 color pixel (32 bits) every SF-^ cycles. 



to a granularity of 
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4: At lOrl'avcragc compression CDU must read a 4 color pixel (32 bits) every IO*SF^ cycles. 
5: 4 color pixel (32 bits) is required, on average, by the CFU cveiy SF^ (scale factor) cycles. 

The time available to vmtc the data is a function of the size of the buffer in DRAM. 1.5 buffering means 4 color pixel 
(32 bits) must be written every SF^ / 2 (scale fector) cycles. Therefore, at a scale factor of SF. 64 bits are required 
every SF^ cycles. 

Since 64 valid bits are written per 256-bit write (Figure 104 on page 282) then the DRAM is accessed every SF^ 
cycles i.e. at SF4 an access every 16 cycles, at SF6 an access every 36 cycles. 

If a page mode burst of 4 accesses is used then each access takes (3 + 1 + 1+1) equals 6 cycles. This means at SF, a set 
of 4 back-tO'back accesses must occur every 4*SF^ cycles. This assumes the page mode select signal is clocked at 320 
MHz, CDU timcslots therefore take 6 cycles. 

For scale £ictors lower than 4 double buffering will be used. 

6: The average bandwidth is 1/2 the peak bandwidth in the case of 1.5 buffering.

7: 4 color pixel (32 bits) read by CFU every SF cycles. At SF4, 32 bits is required every 4 cycles or 256 bits every 32
cycles. At SF6, 32 bits every 6 cycles or 256 bits every 48 cycles.

8: At 1:1 compression require 1 bit/cycle or 256 bits every 256 cycles.

9: The average bandwidth required at 10:1 compression is 0.1 bits/cycle.

10: Two separate reads of 1 bit/cycle.

11: Write at 1 bit/cycle.

12: Each tag can be consumed in at most 126 dot cycles and requires 128 bits. This is a maximum rate of 256 bits
every 252 cycles.

13: 17 x 64 bit reads per line in PEC1 is 5 x 256 bit reads per line in SoPEC. Double-line buffered storage.

14: 128 bytes read per line is 4 x 256 bit reads per line. Double-line buffered storage.

15: 5% dead nozzles 10-bit delta encoded stored with 6-bit dead nozzle mask requires 0.8 bits/cycle read access or a
256-bit access every 320 cycles. This assumes the dead nozzles are evenly spaced out. In practice dead nozzles are
likely to be clumped. Peak bandwidth is estimated as 3 times average bandwidth.

16: 6 bits/cycle requires 6 x 256 bit writes every 256 cycles.

17: 6 bits/160 MHz SoPEC cycle average but will peak at 2 x 6 bits per 106 MHz print head cycle or 8 bits/SoPEC
cycle. The PHI can equalise the DRAM access rate over the line so that the peak rate equals the average rate of 8 bits/
cycle.

18: Assume one 256-bit read per 256 cycles is sufficient i.e. maximum latency of 256 cycles per access is allowable.

19: As an example assume refresh must occur every 3.2 ms. Refresh occurs a row at a time over 5120 rows of 2 parallel
10 Mbit instances. Each refresh takes 3 cycles. This is equivalent to a timeslot every 100 cycles.

20.7 DIU BUS TOPOLOGY

20.7.1 Basic topology

Table 68. SoPEC DIU Requesters

Read        Write       Other
CPU         CPU         Refresh
CDU         SCB
CFU         CDU
LBD         SFU
SFU         DWU
TE(TD)
TE(TFS)
HCU
DNC
LLU
PCU


Table 68 shows the DIU requesters in SoPEC. There are 11 read requesters and 5 write requesters in
SoPEC as compared with 8 read requesters and 4 write requesters in PEC1. Refresh is an additional
requester.

In PEC1, the interface between the DIU and the DIU requesters had the following main features:

• separate control and address signals per DIU requester multiplexed in the DIU according to the
arbitration scheme,

• separate 64-bit write data bus for each DRAM write requester multiplexed in the DIU,

• common 64-bit read bus from the DIU with separate enables to each DIU read requester.

Timing closure for this bussing scheme was straightforward in PEC1. This suggests that a similar scheme
will also achieve timing closure in SoPEC. SoPEC has 5 more DRAM requesters but it will be in a 0.13
um process with more metal layers and SoPEC will run at approximately the same speed as PEC1.
Using 256-bit busses would match the data width of the embedded DRAM but such large busses may
result in an increase in size of the DIU and the entire SoPEC chip. The SoPEC requestors would require
double 256-bit wide buffers to match the 256-bit busses. These buffers, which must be implemented in
flip-flops, are less area efficient than 8-deep 64-bit wide register arrays which can be used with 64-bit
busses. SoPEC will therefore use 64-bit data busses. Use of 256-bit busses would however simplify the DIU
implementation as local buffering of 256-bit DRAM data would not be required within the DIU.

20.7.1.1 CPU DRAM access 

The CPU is the only DIU requestor for which access latency is critical. All DIU write requesters transfer
write data to the DIU using separate point-to-point busses. The CPU will use the cpu_dataout[31:0] bus.
CPU reads will not be over the shared 64-bit read bus. Instead, CPU reads will use a separate 256-bit read
bus.



20.7.2 Making more efficient use of DRAM bandwidth

The embedded DRAM is 256-bits wide. The 4 cycles it takes to transfer the 256-bits over the 64-bit data
busses of SoPEC means that effectively each access will be at least 4 cycles long. It takes only 3 cycles to
actually do a 256-bit random DRAM access in the case of IBM DRAM.

20.7.2.1 Common read bus

If we have a common read data bus, as in PEC1, then if we are doing back to back read accesses the next
DRAM read cannot start until the read data bus is free. So each DRAM read access can occur only every 4
cycles. This is shown in Figure 67 with the actual DRAM access taking 3 cycles leaving 1 unused cycle
per access.



[Figure 67: timing diagram showing pclk, diu_data[63:0], rreq(n+1) to rreq(n+3) and rack(n+1) to rack(n+3); each
of the accesses n, n+1 and n+2 is followed by one unused cycle.]

Figure 67. Shared read bus with 3 cycle random DRAM read accesses



20.7.2.2 Interleaving CPU and non-CPU read accesses 

The CPU has a separate 256-bit read bus. All other read accesses are 256-bit accesses over a shared 64-bit
read bus. Interleaving CPU and non-CPU read accesses means the effective duration of an interleaved
access timeslot is the DRAM access time (3 cycles) rather than 4 cycles. Interleaving is achieved by
ordering the DIU arbitration slot allocation appropriately.

Figure 68 shows interleaved CPU and non-CPU read accesses. 



Doc: SoPEC_hardware_design 
Version: 2.3 



S3 Proprietary Document 



29 Nov 2002 
Page 204 




Figure 68. Interleaving CPU and non-CPU read accesses



20.7.2.3 Interleaving read and write accesses

Having separate write data busses means write accesses can be interleaved with each other and with read
accesses. So now the effective duration of an interleaved access timeslot is the DRAM access time (3
cycles) rather than 4 cycles. Interleaving is achieved by ordering the DIU arbitration slot allocation
appropriately.

Figure 69 shows interleaved read and write accesses. Figure 70 shows interleaved write accesses. 




Figure 69. Interleaving read and write accesses with 3 cycle random DRAM accesses

Write data still takes 4 cycles to transmit over 64-bit busses so 256-bit buffers are required in the DIU to
gather the write data from the requesters.

Figure 70. Interleaving write accesses with 3 cycle random DRAM accesses



20.7.3 Bus widths

Table 69. SoPEC DIU Requesters Data Bus Width

Read        Bus access width    Write       Bus access width
CPU         256 (separate)      CPU         32 (OPEN ISSUE)
CDU         64 (shared)         SCB         64
CFU         64 (shared)         CDU         64
LBD         64 (shared)         SFU         64
SFU         64 (shared)         DWU         64
TE(TD)      64 (shared)
TE(TFS)     64 (shared)
HCU         64 (shared)
DNC         64 (shared)
LLU         64 (shared)
PCU         64 (shared)


20.7.4 Conclusions 

Reads and writes can be interleaved with a separate 256-bit read bus for the CPU for minimum latency
DIU access. Interleaving can be performed by inserting write accesses or CPU accesses between shared
read bus accesses. The interleaving is achieved by ordering the DIU arbitration slot allocation
appropriately.

20.8 SOPEC DRAM ADDRESSING SCHEME

The embedded DRAM is composed of 256-bit words. However the CPU-subsystem may need to write
individual bytes of DRAM. Therefore it was decided to make the DIU byte addressable. 22 bits are
required to byte address 20 Mbit of DRAM.

Most blocks read or write 256-bit words of DRAM. Therefore only the top 17 bits i.e. bits 21 to 5 are
required to address 256-bit word aligned locations.

The exceptions are:

• CDU which can write 64-bits so only the top 19 address bits i.e. bits 21-3 are required.

• CPU writes can be 8, 16 or 32-bits. The cpu_diu_wmask[1:0] pins indicate whether to write 8, 16 or 32
bits.

All DIU accesses must be within the same 256-bit aligned DRAM word.
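The addressing scheme can be illustrated with a small sketch. The helper names are hypothetical, and the sketch assumes 1 Mbit = 2^20 bits (the 22-bit result is the same if 1 Mbit is taken as 10^6 bits).

```python
DRAM_BITS = 20 * 2**20        # 20 Mbit, assuming 1 Mbit = 2**20 bits
DRAM_BYTES = DRAM_BITS // 8


def address_bits(num_bytes):
    """Number of address bits needed to byte address num_bytes locations."""
    return (num_bytes - 1).bit_length()


def split_address(byte_addr):
    """Split a 22-bit byte address into the fields used by the requesters.

    Returns (256-bit word index from bits 21:5,
             64-bit lane within the word from bits 4:3,
             byte offset within the word from bits 4:0).
    """
    return byte_addr >> 5, (byte_addr >> 3) & 0x3, byte_addr & 0x1F


def within_one_word(byte_addr, size_bytes):
    """True if the access stays inside a single 256-bit aligned DRAM word."""
    return (byte_addr >> 5) == ((byte_addr + size_bytes - 1) >> 5)


print(address_bits(DRAM_BYTES))
print(split_address(0x25))
print(within_one_word(28, 4), within_one_word(30, 4))
```

A 4-byte access starting at byte 30 would straddle two 256-bit words, which is the case the last rule above excludes.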
20.9 DIU Protocols

The DIU protocols are

• pipelined i.e. the following transaction is initiated while the previous transfer is in progress.

• split transaction i.e. the transaction is split into independent address and data transfers.

20.9.1 Read Protocol except CPU

The SoPEC read requestors, except for the CPU, perform single 256-bit read accesses with the read data
being transferred from the DIU in 4 consecutive cycles over a shared 64-bit read bus, diu_data[63:0]. The
read address <unit>_diu_radr[21:5] is 256-bit aligned.

The read protocol is:

• <unit>_diu_rreq is asserted along with a valid <unit>_diu_radr[21:5].

• The DIU acknowledges the request with diu_<unit>_rack. The request should be deasserted. The
minimum number of cycles between <unit>_diu_rreq being asserted and the DIU generating a
diu_<unit>_rack strobe is 2 cycles (1 cycle to register the request, 1 cycle to perform the arbitration -
see Section 20.13.6).

• The read data is returned on diu_data[63:0] and its validity is indicated by diu_<unit>_rvalid.

• When four diu_<unit>_rvalid pulses have been received then if there is a further request
<unit>_diu_rreq should be asserted again. diu_<unit>_rvalid will always be asserted by the DIU
for four consecutive cycles. The first diu_<unit>_rvalid pulse will occur 3 cycles after
diu_<unit>_rack (1 cycle to transfer the address to the DRAM, 2 cycles for the read data to be
returned from the DRAM).
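As a rough illustration of the timing relationships in the read protocol, the sketch below computes the earliest cycle numbers for each handshake event. It is a simple arithmetic model, not RTL, and the function and key names are invented for this example.

```python
def read_timeline(req_cycle=0, arbitration_delay=2):
    """Cycle numbers of the handshake events for one 256-bit non-CPU read.

    arbitration_delay is the rreq-to-rack delay; the text gives 2 cycles as
    the minimum (1 to register the request, 1 to arbitrate). It can be
    longer when the requester has to wait for its timeslot.
    """
    if arbitration_delay < 2:
        raise ValueError("rack cannot occur less than 2 cycles after rreq")
    rack = req_cycle + arbitration_delay
    # 1 cycle to transfer the address to the DRAM + 2 cycles DRAM read latency.
    first_rvalid = rack + 3
    # 256 bits are returned as four consecutive 64-bit transfers.
    rvalid = [first_rvalid + i for i in range(4)]
    return {"rreq": req_cycle, "rack": rack, "rvalid": rvalid}


print(read_timeline())
```

Under these assumptions the last 64-bit word arrives 8 cycles after the request was first asserted in the best case.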



[Figure 71: timing diagram showing pclk, <unit>_diu_rreq, diu_<unit>_rack, <unit>_diu_radr[21:5],
diu_<unit>_rvalid and diu_data[63:0].]

Figure 71. Read protocol for a SoPEC Unit making a single 256-bit access



20.9.2 Read Protocol for CPU 



The CPU performs single 256-bit read accesses with the read data being transferred from the DIU over a
dedicated 256-bit read bus for DRAM data, dram_cpu_data[255:0]. The read address cpu_adr[21:5] is
256-bit aligned.

The CPU DIU read protocol is:

• cpu_diu_rreq is asserted along with a valid cpu_adr[21:5].

• The DIU acknowledges the request with diu_cpu_rack. The request should be deasserted. The
minimum number of cycles between cpu_diu_rreq being asserted and the DIU generating a diu_cpu_rack
strobe is 2 cycles (1 cycle to register the request, 1 cycle to perform the arbitration - see Section
20.13.6).

• The read data is returned on dram_cpu_data[255:0] and its validity is indicated by diu_cpu_rvalid.

• When the diu_cpu_rvalid pulse has been received then if there is a further request cpu_diu_rreq should
be asserted again. The diu_cpu_rvalid pulse will occur 3 cycles after rack (1 cycle to transfer the
address to the DRAM, 2 cycles for the read data to be returned from the DRAM).




[Figure 72: timing diagram showing the CPU read address cpu_adr, diu_cpu_rvalid and dram_cpu_data[255:0].]

Figure 72. Read protocol for a CPU making a single 256-bit access



20.9.3 Write Protocol except CPU and CDU 

The SoPEC write requestors, except for the CPU and CDU, perform single 256-bit write accesses with the
write data being transferred to the DIU in 4 consecutive cycles, over dedicated point-to-point 64-bit write
data busses. The write address <unit>_diu_wadr[21:5] is 256-bit aligned.

The write protocol is:

• <unit>_diu_wreq is asserted along with a valid <unit>_diu_wadr[21:5].

• The DIU acknowledges the request with diu_<unit>_wack. The request should be deasserted. The
minimum number of cycles between <unit>_diu_wreq being asserted and the DIU generating a
diu_<unit>_wack strobe is 2 cycles (1 cycle to register the request, 1 cycle to perform the arbitration -
see Section 20.13.6).

• In the clock cycles following wack the SoPEC Unit outputs the <unit>_diu_data[63:0], asserting
<unit>_diu_wvalid. Write data should be output as soon as possible after receiving the wack. Accessing
registers, register arrays or SRAMs may incur different delays. The first <unit>_diu_wvalid pulse
can occur in the clock cycle after diu_<unit>_wack. In the case of register array or SRAM access, the
first <unit>_diu_wvalid pulse will occur 2 clock cycles after diu_<unit>_wack.

• Once all the write data has been output then if there is a further request <unit>_diu_wreq should be
asserted again.

A timeout mechanism will be implemented to ensure that the DIU will not lock-up if four
<unit>_diu_wvalid pulses are not provided.

[Figure 73: timing diagram showing pclk, <unit>_diu_wreq, <unit>_diu_wadr[21:5], diu_<unit>_wack,
<unit>_diu_data[63:0] and <unit>_diu_wvalid.]

Figure 73. Write Protocol shown for a SoPEC Unit making a single 256-bit access



20.9.4 CPU Write Protocol

The CPU performs single writes which can be 8, 16 or 32-bits with the write data being transferred to the
DIU over the cpu_dataout[31:0] bus. The write address cpu_adr[21:0] is byte aligned.

The CPU write protocol is:

• cpu_diu_wreq is asserted along with a valid cpu_adr[21:0] and a write mask cpu_diu_wmask[1:0] to
indicate whether an 8, 16 or 32-bit access is required.

• The DIU acknowledges the request with diu_cpu_wack. The request should be deasserted. The
minimum number of cycles between cpu_diu_wreq being asserted and the DIU generating a
diu_cpu_wack strobe is 2 cycles (1 cycle to register the request, 1 cycle to perform the arbitration - see
Section 20.13.6).

• In the clock cycle following diu_cpu_wack the CPU outputs the cpu_dataout[31:0], asserting
cpu_diu_wvalid. Write data should be output as soon as possible after receiving the diu_cpu_wack.
The earliest the cpu_diu_wvalid pulse can occur is in the first clock cycle after diu_cpu_wack.

• Once the write data has been output then if there is a further request cpu_diu_wreq should be asserted
again.

[Figure 74: timing diagram showing pclk, cpu_diu_wreq, cpu_adr[21:0], cpu_diu_wmask[1:0], diu_cpu_wack,
cpu_dataout[31:0] and cpu_diu_wvalid.]

Figure 74. Write Protocol shown for a CPU making an 8, 16 or 32-bit access



20.9.5 CDU Write Protocol

The CDU performs four 64-bit writes to 4 contiguous 256-bit DRAM addresses with the first address
specified by cdu_diu_wadr[21:3]. The write address cdu_diu_wadr[21:3] is 64-bit aligned.

The write protocol is:

• cdu_diu_wreq is asserted along with a valid cdu_diu_wadr[21:3].

• The DIU acknowledges the request with diu_cdu_wack. The request should be deasserted. The
minimum number of cycles between cdu_diu_wreq being asserted and the DIU generating a
diu_cdu_wack strobe is 2 cycles (1 cycle to register the request, 1 cycle to perform the arbitration - see
Section 20.13.6).

• In the clock cycles following wack the CDU outputs the cdu_diu_data[63:0], together with asserted
cdu_diu_wvalid. Write data should be output as soon as possible after receiving the wack. Accessing
registers, register arrays or SRAMs may incur different delays. The first cdu_diu_wvalid pulse can
occur in the clock cycle after diu_cdu_wack. In the case of register array or SRAM access, the first
cdu_diu_wvalid pulse will occur 2 clock cycles after diu_cdu_wack.

• Once all the write data has been output then if there is a further request cdu_diu_wreq should be
asserted again.

A timeout mechanism will be implemented to ensure that the DIU will not lock-up if four cdu_diu_wvalid
pulses are not provided.

[Figure 75: timing diagram showing pclk, cdu_diu_wreq, cdu_diu_wadr, diu_cdu_wack, cdu_diu_data[63:0]
(data words 1 to 4) and cdu_diu_wvalid.]

Figure 75. Write Protocol shown for CDU making four contiguous 64-bit accesses

20.10 DIU ARBITRATION MECHANISM

The DIU will arbitrate access to the embedded DRAM. The arbitration scheme is outlined in the next
sections.

20.10.1 Timeslot based arbitration scheme 

Table 67 summarised the bandwidth requirements of the SoPEC requestors to DRAM. If we allocate the
DIU requestors in terms of peak bandwidth then we require 36 bits/cycle (at SF = 6) and 42.5 bits/cycle (at
SF = 4) for all the requestors except the CPU.

A timeslot scheme is defined with 64 main timeslots. The number of used main timeslots is programmable 
between 0 and 64. 

Since DRAM read requestors, except for the CPU, are connected to the DIU via a 64-bit data bus each
256-bit DRAM access requires 4 pclk cycles to transfer the read data over the shared read bus. The
timeslot rotation period for 64 timeslots each of 4 pclk cycles is 256 pclk cycles or 1.6 µs, assuming pclk is
160 MHz. Each timeslot represents a 256-bit access every 256 pclk cycles or 1 bit/cycle. This is the
granularity of the majority of DIU requestors bandwidth requirements in Table 67.
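The rotation arithmetic above can be reproduced in a few lines. The constant and function names are chosen for this sketch only; the slot count helper simply rounds a bandwidth requirement up to whole main timeslots and ignores sub-timeslots.

```python
import math

PCLK_HZ = 160e6        # pclk frequency assumed in the text
MAIN_TIMESLOTS = 64    # number of main timeslots
CYCLES_PER_SLOT = 4    # 256 bits over a 64-bit bus
WORD_BITS = 256        # width of one DRAM access

rotation_cycles = MAIN_TIMESLOTS * CYCLES_PER_SLOT
rotation_us = rotation_cycles / PCLK_HZ * 1e6
bits_per_cycle_per_slot = WORD_BITS / rotation_cycles


def main_slots_needed(bits_per_cycle):
    """Whole main timeslots needed to carry a given bandwidth."""
    return math.ceil(bits_per_cycle / bits_per_cycle_per_slot)


print(rotation_cycles, rotation_us, bits_per_cycle_per_slot)
print(main_slots_needed(36), main_slots_needed(42.5))
```

With the peak figures quoted for all requestors except the CPU, this would suggest 36 main timeslots at SF = 6 and 43 at SF = 4 if only whole timeslots were used.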

The SoPEC DIU requesters can be represented using 5 bits (Table on page 229). Using 64 timeslots
means that to allocate each timeslot to a requester a total of 64 times 5 configuration registers is required
for the 64 main timeslots.

Timeslot based arbitration works by having a pointer point to the current timeslot. When re-arbitration is
signaled the arbitration pointer will advance to the next timeslot. If the SoPEC Unit assigned to the current
timeslot is not requesting then the unused timeslot arbitration mechanism outlined in Section 20.10.5 is
used to select the arbitration winner.

The timeslot pointer advances when the DIU issues the next command to the DRAM. Each timeslot
therefore denotes a single access. The duration of the timeslot depends on the access.

If the SoPEC Unit pointed to by the current timeslot pointer is not requesting then the slot will be allocated
according to the mechanism described in Section 20.10.5.



[Figure 76: a row of timeslots (..., n, n+1, ...) with the current timeslot pointer advancing along it.]

Figure 76. Timeslot based arbitration



20.10.2 Separate read and write arbitration windows 

For write accesses, except the CPU, 256-bits of write data are transferred from the SoPEC DIU write
requestors over 64-bit write busses in 4 clock cycles. This write data transfer latency means that write
accesses, except for CPU writes, must be arbitrated 4 cycles in advance. The [to be included figure and
explanation] shows why this is necessary.

Since write arbitration must occur 4 cycles in advance, and the minimum duration of a timeslot is
3 cycles, the arbitration rules must be modified to initiate write accesses in advance accordingly. There is
a timeslot lookahead pointer shown in Figure 77 two timeslots in advance of the current timeslot pointer.



[Figure 77: the current timeslot pointer at timeslot n+1 and the timeslot lookahead pointer at a later timeslot.]

Figure 77. Timeslot based arbitration with separate read and write pointers

The following examples illustrate separate read and write timeslot arbitration. 



[Figure 78: programmed timeslot order, timeslot arbitration order and actual timeslot order for a sequence of
read (R) and write (W) timeslots, with the write latency marked.]

Figure 78. Example (a), separate read and write arbitration

In Figure 78 writes are arbitrated two timeslots in advance. Reads are arbitrated in the same cycle. Writes
[...]

[...] Figure 80 and Figure 81. The actual timeslot order is always the
same as the programmed timeslot order i.e. out of order accesses do not occur and data coherency is never
an issue.

Each write must always incur a latency of two timeslots. If the first write occurs in the first timeslot then
all following timeslots will incur a latency of two timeslots. This is shown in Figure 78 and Figure 79. If
[...] timeslots then all following timeslots will incur a latency of two
timeslots. This is shown in Figure 80.

[Figure 79: programmed timeslot order, timeslot arbitration order and actual timeslot order for a sequence of
read (R) and write (W) timeslots, with the write latency marked.]

Figure 79. Example (b), separate read and write arbitration



[Figure 80: programmed timeslot order, timeslot arbitration order and actual timeslot order for a sequence of
read (R) and write (W) timeslots, with the write latency marked.]

Figure 80. Example (c), separate read and write arbitration

[Figure 81: programmed timeslot order, timeslot arbitration order and actual timeslot order for a sequence of
read (R) and write (W) timeslots, with the initial write latency marked.]

Figure 81. Example (d), separate read and write arbitration



Table 70 shows the 4 scenarios depending on whether the current timeslot and timeslot lookahead pointers
point to read or write accesses.

To be checked and updated.

Table 70. Arbitration with separate windows for read and write accesses

Current timeslot    Timeslot lookahead    Actions
pointer             pointer
read                write                 Initiate read transfer.
                                          Initiate write transfer.
read1               read2                 Initiate read1 transfer.
write1              write2                Initiate write2 transfer.
write               read                  No action.


If the current timeslot pointer points to a read access then this will be initiated immediately.

If the timeslot lookahead pointer points to a write access then this access is initiated immediately, or
immediately after the read access associated with the current timeslot pointer is initiated.

When a write access is initiated the DIU will capture the write address and will do the DRAM write two
timeslots in advance when the associated write data has been transferred to the DIU.

To be checked and updated: At initialisation, both pointers point to the first timeslot. The lookahead
pointer advances to the second timeslot and the third timeslot in successive clock cycles until it is two
timeslots in advance of the current timeslot pointer. Then both pointers advance in tandem. At each step, the
rules in Table 70 are obeyed. This leads to the behaviour shown in the examples of Figure 78 to Figure 81.

CPU write accesses are excepted from the lookahead mechanism.

Timing diagrams for these scenarios are shown in Section 20.13 Implementation.
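The two rules above (reads start at the current pointer, writes start at the lookahead pointer) can be sketched as follows. This is only an illustration of the four pointer combinations: the function names are invented, the wrap-around of the lookahead pointer at the end of the rotation is an assumption, and initialisation and CPU writes are not modelled.

```python
READ, WRITE = "R", "W"


def pointer_actions(current, lookahead):
    """Actions for one arbitration step given the access type at each pointer."""
    actions = []
    if current == READ:
        # Reads are arbitrated in the timeslot in which they occur.
        actions.append("initiate read at current pointer")
    if lookahead == WRITE:
        # Writes are arbitrated in advance so the write data can be gathered.
        actions.append("initiate write at lookahead pointer")
    return actions or ["no action"]


def walk(programmed_order, lookahead_distance=2):
    """Step both pointers in tandem over a programmed timeslot order."""
    n = len(programmed_order)
    steps = []
    for i, current in enumerate(programmed_order):
        # Assumed: the lookahead pointer wraps around the rotation.
        ahead = programmed_order[(i + lookahead_distance) % n]
        steps.append((i, current, ahead, pointer_actions(current, ahead)))
    return steps


for step in walk(["W", "W", "R", "W"]):
    print(step)
```

A write at the current pointer needs no action in this model because it was already initiated when the lookahead pointer passed over it.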
20.10.3 Arbitration of CPU accesses

The CPU can be allocated timeslots like any other DIU requestor. If CPU accesses are interleaved between
[...] CPU requires minimum latency DRAM access i.e. preferably the CPU should get the next available
timeslot whenever it requires [...]

[...] in Table 71. This is the time between the CPU making [...]



Table 71. Estimated CPU read access latency ignoring caching

Operation                                    Duration
register the CPU read request                1 cycle
complete the arbitration of the request      1 cycle
transfer the read address to the DRAM        1 cycle
DRAM read latency                            2 cycles
register the read data                       1 cycle
TOTAL                                        6 cycles



" thf rp^r* " '^'^""'^ ^^^'^ immediately after receiving data from the DIU then 

^^3 c^cirScPU^^^^^^ T"? - thatSnesloU 

^L^^^^?*! °-'"^^fn?y it wiU have to wait 

IS^d i^y'SSrac^iS: "'"'^ "'^'-^ °^ -^-^ — - 

To avoid the CPU having to wait for its next timeslot it is desirable to have a mechanism for ensuring that
the CPU always gets the next available timeslot without incurring any latency on the non-CPU timeslots.
This can be done by defining each timeslot as consisting of a CPU access preceding a non-CPU access.
Each timeslot will last 6 cycles i.e. a CPU access of 3 cycles and a non-CPU access of 3 cycles. This is
the interleaving scheme outlined in Section 20.7.2.2. If the CPU does not require an access, the
timeslot will take 3 or 4 cycles and the timeslot rotation will go faster. A summary is given in Table 72.

Table 72. Timeslot access times.

Access                         Duration               Explanation
CPU access + non-CPU access    3 + 3 = 6 cycles       Interleaved access
non-CPU access                 4 cycles               Access and preceding access both to shared
                                                      read bus
non-CPU access                 3 cycles               Access and preceding access not both to shared
                                                      read bus
CDU write access               3+1+1+1 = 6 cycles     Page mode select signal is clocked at 320 MHz


CDU write accesses preceded by a CPU access require 9 cycles.
CDU timeslots therefore take longer than all other DIU requestors timeslots.

With a 256 cycle rotation there can be 42 accesses of 6 cycles. This is just enough timeslots for SF = 4
operation, ignoring implementation pipeline latencies.

For [...] applications, it is desirable to have more timeslots available in the same 256 cycle
rotation. So two counters of 4-bits each are defined allowing the CPU to get a maximum of cpu_timeslots
accesses in total_timeslots. A timeslot counter starts at total_timeslots and decrements every timeslot.
Another counter starts at cpu_timeslots and decrements every timeslot in which the CPU uses its access.
When the CPU timeslot counter goes to zero before total_timeslots no further CPU accesses are allowed.
When the total_timeslots counter reaches zero both counters are reset to their respective initial values.

When cpu_timeslots is set to zero then no accesses will be preceded by CPU accesses. The CPU can be
allocated timeslots like any other DIU requestor.
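A minimal behavioural sketch of the two counters is given below. The class and method names are invented for illustration, and the sketch assumes total_timeslots is at least 1 and that both values fit in 4 bits.

```python
class CpuPreAccessCounters:
    """Limits CPU pre-accesses to cpu_timeslots out of every total_timeslots."""

    def __init__(self, cpu_timeslots, total_timeslots):
        # Both counters are 4 bits wide, so values are limited to 0..15.
        if not (0 <= cpu_timeslots <= 15 and 1 <= total_timeslots <= 15):
            raise ValueError("counter values must fit in 4 bits")
        self.cpu_init = cpu_timeslots
        self.total_init = total_timeslots
        self.cpu_left = cpu_timeslots
        self.total_left = total_timeslots

    def next_timeslot(self, cpu_requesting):
        """Advance one timeslot; return True if a CPU pre-access is granted."""
        granted = cpu_requesting and self.cpu_left > 0
        if granted:
            self.cpu_left -= 1
        self.total_left -= 1
        if self.total_left == 0:
            # End of the window: reload both counters.
            self.cpu_left = self.cpu_init
            self.total_left = self.total_init
        return granted


counters = CpuPreAccessCounters(cpu_timeslots=2, total_timeslots=4)
print([counters.next_timeslot(True) for _ in range(8)])
```

With a CPU that requests in every timeslot, the sketch grants the first two pre-accesses in each window of four timeslots and refuses the rest.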

If CPU accesses are interleaved between the shared read bus accesses then the DIU timeslots will take 3
cycles.

The various modes of operation are summarised in Table 73 with a nominal rotation period of 256 cycles.

Table 73. Timeslot allocation modes with nominal rotation period of 256 cycles




CPU Pre-access
i.e. cpu_timeslots = total_timeslots
    Nominal timeslot duration: 6 cycles
    Number of timeslots: 42 timeslots
    Notes: Each access is CPU + non-CPU. If CPU does not use a timeslot then rotation is faster.

Fractional CPU Pre-access
i.e. cpu_timeslots < total_timeslots
    Nominal timeslot duration: 4 or 6 cycles
    Number of timeslots: 42-64 timeslots
    Notes: Each CPU + non-CPU access requires a 6 cycle timeslot. Individual non-CPU timeslots take
    4 cycles if current access and preceding access are both to shared read bus. Individual non-CPU
    timeslots take 3 cycles if current access and preceding access are not both to shared read bus.

Interleaved
i.e. cpu_timeslots = 0
    Nominal timeslot duration: 4 cycles
    Number of timeslots: 64 timeslots
    Notes: Timeslot rotation is faster by 1 cycle for each CPU, write access or interleaved read access.


20.10.4 Sub-timeslots 



From the bandwidth requirements of the DIU requesters in Table 67, most DIU requesters require
bandwidths of 1 bit/cycle or multiples thereof. However, some of the requestors require much lower
bandwidths. This suggests that some sub-timeslots of lower granularity than a nominal 1 bit/cycle should be
defined.

Table 74. Sub-timeslot definition

Name            Number of sub-slots    Bandwidth per sub-slot
Sub4timeslot    4                      0.25 bits/cycle
Sub8timeslot    8                      0.125 bits/cycle

Each sub-slot pointer gets advanced each time it is accessed regardless of whether its slot is used or not.

Sub-timeslots are similar in all other ways to main timeslots i.e.

• they can have preceding CPU accesses in a similar manner

• unused slots are decided by the same unused timeslot allocation mechanism (Section 20.10.5)



[Figure 82: diagram of the current timeslot pointer, the main timeslots and their sub-timeslots.]

Figure 82. Example sub-timeslot allocation

An example sub-timeslot allocation is shown in Figure 82 

— ^^^^^^ 

..w'Le./or Srtin ° " -W«««tor will win arbitration and the 

20.10.5 Allocating unused timeslots 

Unused slots are re-allocated on a two-level round-robin basis. This is best-effort traffic.

Each SoPEC requestor has two associated bits. RoundRobinLevel indicates whether it is in level 1 or
2 round-robin, and RoundRobinEnable indicates whether it is enabled or not in the round-robin.

Table 75. Round-robin selection

RoundRobinLevel    RoundRobinEnable    Round-robin membership
0                  0                   Not enabled
0                  1                   Level 1
1                  0                   Not enabled
1                  1                   Level 2

Separate read and write round-robin trees are needed, one for read accesses and one for write accesses.
CDU write accesses cannot be included in the round-robin allocation for write as CDU accesses take 6
cycles. The write accesses which the CDU write could otherwise replace require only 3 or 4 cycles.
Round-robin allocations do not have CPU pre-accesses.

A pointer points to the current allocated unit in each of the round-robin levels. If the unit pointed to in
the level 1 round-robin is requesting then this unit wins the arbitration and the pointer is advanced to the
next unit. If the unit pointed to in the level 1 round-robin is not requesting then the next units in the level 1
round-robin are examined in turn. If a requesting unit is found this unit wins the arbitration and the pointer
is advanced to the next unit. If no unit in the first level is requesting then the second level of round-robin
is examined in the same way as the first level of the round-robin.
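A possible reading of the two-level round-robin selection is sketched below. The function name is invented, the assignment of units to levels is an arbitrary example, and the rule that the pointer moves to the unit after the winner is an assumption where the source text is not fully legible.

```python
def pick_round_robin(levels, pointers, requesting):
    """Select the winner of an unused timeslot, or None if nobody requests.

    levels     : [level1_units, level2_units], enabled units only
    pointers   : one index per level, updated in place when a unit wins
    requesting : set of unit names currently requesting
    """
    for level, units in enumerate(levels):
        count = len(units)
        for offset in range(count):
            index = (pointers[level] + offset) % count
            if units[index] in requesting:
                # Assumed: the pointer advances to the unit after the winner.
                pointers[level] = (index + 1) % count
                return units[index]
    return None


# Hypothetical configuration of the write round-robin tree.
levels = [["SCB", "DWU"], ["SFU(W)"]]
pointers = [0, 0]
print(pick_round_robin(levels, pointers, {"SCB", "DWU"}))
print(pick_round_robin(levels, pointers, {"SCB", "DWU"}))
print(pick_round_robin(levels, pointers, {"SFU(W)"}))
print(pick_round_robin(levels, pointers, set()))
```

Level 2 units only win when no level 1 unit is requesting, which is what makes this a strict two-level scheme.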



Table 76. Write round-robin registers bit order

Name       Bit index
CPU(W)     0
SCB        1
SFU(W)     2
DWU        3



20.10.6 Background refresh controller 

tlt^r^"'^'^ 'f^'^ ^ implemented that wiU issue a refresh and pause the timeslot 

ri"w^r^Lr:.L^ ™^ -^-^^ — - --^oi that insulBcie:: 






20.11 Guidelines for programming the DIU 

Guidelines for programming the DIU arbitration scheme are given in this section together with an
example.

20.11.1 Implementation pipeline latencies 

i^encl^T^Se '^^IS^I? *° '''^^ implementation pipeline 

SS^for ^nleZ^I^ Tf ' ^'°erm^l.. This means I or 2 timeslots can be remoSo 

p„„ "nplementation latency. Each timeslot wiU allow for 6 cycles implementation latencyTS'S 
^^^i T ^ f^"" ^'"^ » a rotation iSte^S^ 

s:; s^jLr ^^^'^ ^^^^ 

20.11.2 Ensuring sufficient DNC and PCU access

PCU accesses are exceptional events and should complete in as short a time as possible.
Similarly, we must ensure there is sufficient free bandwidth for DNC accesses when [...]
dead nozzles occur. In Table 67 DNC is allocated 3 times average bandwidth. [...]

20.11.3 Basing timeslot allocation on peak bandwidths

[...] compression rates for the CDU and LBD [...] simplify the main timeslot and sub-timeslot allocation
by basing the allocation on peak bandwidths. The only variable in determining timeslot allocations then
becomes the scale factor.

If slot allocation is based on peak bandwidth requirements then DRAM access will be guaranteed to all
[...] the peaks deterministically by adding some cycles to the print line time.

20.11.4 Adjacent timeslot limitations

[...] transfer the read or write data before requesting
again. The time to perform this operation is greater than the time between adjacent timeslots. [...]

20.11.5 Line margin 

[...] HCUNumDots may not be a multiple of 256 bits the
last 256-bit DRAM word on the line can contain extra zeros. In this case, the SFU may not be able to
provide 1 bit/cycle to the HCU. This could lead to a stall by the SFU. This stall could then propagate if [...]
calculation. DRAM service period - X scale factor * dots used from last DRAM read for HCU line.

Similarly if the line length is not a multiple of 256-bits then e.g. the LLU could read data from DRAM
which contains padded zeros. This could lead to a stall. This stall could then propagate if [...]

A single addition of 256 cycles to the line length will suffice for all DIU requesters to mask these stalls.

20.11.6 Example DIU programming

A full example to be worked out.

[...] timeslot configuration registers (Table 82) for peak required bandwidths
of SoPEC Units according to the scale factor used for the document.

Program unused slots to use the round-robin allocation to share unused slots between all DIU requesters.

20.12 CPU DRAM ACCESS PERFORMANCE

This section does not yet reflect any implementation pipeline latencies. 

[...] in terms of guaranteed bandwidth and average bandwidth [...]

The CPU's access rate to memory depends on

• how often it can get access to DIU timeslots.

Table 71 estimated the CPU read latency ignoring caching as 6 cycles.

How often the CPU can get access to DIU timeslots depends on the access type.

Table 77. CPU DRAM access performance

Access type | CPU read latency | Guaranteed bandwidth | Description
--- | --- | --- | ---
 | 6 cycles | Lower bound (guaranteed bandwidth) is 160 MHz / 6 = 26.67 MHz | CPU can access every timeslot
Fractional CPU pre-access | | Lower bound (guaranteed bandwidth) is (160 MHz * N / P) | CPU accesses precede a fraction N of timeslots, where N = C/T, C = cpu_timeslots, T = total_timeslots, P = (6*C + 4*(T-C))/T
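The lower bound in Table 77 can be checked numerically. The following is only an illustrative sketch: the function name and the default pclk of 160 MHz are assumptions made for the example, and the formula is the one given in the table (N = C/T, P = (6*C + 4*(T-C))/T).

```python
def guaranteed_cpu_rate_mhz(cpu_timeslots: int, total_timeslots: int,
                            pclk_mhz: float = 160.0) -> float:
    """Lower bound on the CPU DRAM access rate, following Table 77.

    cpu_timeslots (C) and total_timeslots (T) are the two values in the table.
    """
    if not 0 < cpu_timeslots <= total_timeslots:
        raise ValueError("expected 0 < C <= T")
    n = cpu_timeslots / total_timeslots
    # P is the average timeslot length in cycles: 6 cycles for the C
    # timeslots preceded by a CPU access, 4 cycles for the remaining T - C.
    p = (6 * cpu_timeslots + 4 * (total_timeslots - cpu_timeslots)) / total_timeslots
    return pclk_mhz * n / p


if __name__ == "__main__":
    # C = T reduces to the first row of the table: 160 MHz / 6.
    print(guaranteed_cpu_rate_mhz(4, 4))
    print(guaranteed_cpu_rate_mhz(1, 4))
```

When C equals T the expression collapses to pclk / 6, which is the first row of the table.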



20.12.1 CPU DRAM access performance with interleaved access mode

Table 78 shows the guaranteed periodic CPU access with 4 cycle DRAM access and pclk = 160 MHz.





Timeslots left for CPU | 25.25 | 21.5
Maximum wait for timeslot | 12 cycles | 12 cycles
CPU rate | 13.3 MHz | 13.3 MHz



I MHz 
[...] pclk = 160 MHz. This will be a [...]

[...] cycle DRAM access and pclk = 160 MHz.





Timeslots left for CPU | 34.95 | 30.8
Maximum wait for timeslot | 8 cycles | 12 cycles
CPU rate | 20 MHz | 13.3 MHz
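The CPU rate figures quoted in this section are consistent with dividing pclk by the maximum wait for a timeslot. A minimal sketch of that arithmetic, assuming this is how the rates were derived (the function name is hypothetical):

```python
def periodic_cpu_rate_mhz(max_wait_cycles: int, pclk_mhz: float = 160.0) -> float:
    """Guaranteed periodic CPU access rate for a given maximum wait."""
    if max_wait_cycles <= 0:
        raise ValueError("max_wait_cycles must be positive")
    # One guaranteed CPU access every max_wait_cycles cycles of pclk.
    return pclk_mhz / max_wait_cycles


if __name__ == "__main__":
    for wait in (8, 12):
        print(wait, "cycles:", round(periodic_cpu_rate_mhz(wait), 1), "MHz")
```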



Interleaving of CPU and write accesses with shared read bus accesses will mean some of the timeslots will 
20.13 Implementation

The DRAM Interface Unit (DIU) is partitioned into 2 logical blocks to facilitate design and verification.

a. The DRAM Access Unit (DAU) which interfaces to the SoPEC DIU requesters 

b. The DRAM Controller Unit (DCU) which accesses the embedded DRAM. 



[Figure 83: block diagram. The SoPEC Units connect to the DRAM Access Unit (DAU), which connects to the DRAM Controller Unit (DCU), which connects to the eDRAM.]

Figure 83. DIU Partition



The DCU is designed to interface with single bank 20 Mbit IBM Cu-11 embedded DRAM performing
random accesses every 3 cycles. Page mode [...] associated with the CDU, are also supported.
20.13.1 Definition of DCU IO



Table 80. DCU interface

Port name | Pins | I/O | Description
--- | --- | --- | ---
Clocks and Resets | | |
pclk | 1 | In | SoPEC Functional clock
prst_n | 1 | In | Active-low, synchronous reset in pclk domain
Inputs from DAU | | |
dau_dcu_cmdavail | 1 | In | Signal indicating a DAU command is available i.e. dau_cmd_adr, dau_cmd_rwn and dau_cmd_refresh are valid.
dau_dcu_cmdadr[21:5] | 17 | In | Signal indicating the address for the DRAM access. This is a 256-bit aligned DRAM address.
dau_dcu_cmdrwn | 1 | In | Signal indicating the direction for the DRAM access (1=read, 0=write).
dau_dcu_cmdrefresh | 1 | In | Signal indicating that a refresh command is to be issued. If asserted dau_cmd_adr and dau_cmd_rwn will be ignored.
dau_dcu_wdata | 256 | In | 256-bit write data to DCU
dau_dcu_wmask | 256 | In | 256-bit write data mask to DCU
dau_dcu_wvalid | 1 | In | Signal indicating valid write data and write mask.
Outputs to DAU | | |
dcu_dau_cmdaccept | 1 | Out | Signal indicating that the DCU has accepted a valid command from the DAU.
dcu_dau_refreshcomplete | 1 | Out | Signal indicating that the DCU has completed a refresh.
dcu_dau_rdata | 256 | Out | 256-bit read data from DCU.
dcu_dau_rrvalid | 1 | Out | Signal indicating valid read data on dcu_rdata.
Outputs to DRAM | | |
Inputs from DRAM | | |
20.13.2 Definition of DAU IO



Table 81. DAU Interface

Port name | Pins | I/O | Description
--- | --- | --- | ---
Clocks and Resets | | |
pclk | 1 | In | SoPEC Functional clock
prst_n | 1 | In | Active-low, synchronous reset in pclk domain
CPU Interface data and control signals | | |
cpu_adr[9:2] | 8 | In | CPU address bus. 8 bits are required to decode the address space for this block
cpu_dataout[31:0] | 32 | In | Shared write data bus from the CPU
diu_cpu_data[31:0] | 32 | Out | Configuration, status and debug read data bus to the CPU
cpu_rwn | 1 | In | Common read/not-write signal from the CPU
cpu_acode[1:0] | 2 | In | CPU access code signals. cpu_acode[0] - Program (0) / Data (1) access. cpu_acode[1] - User (0) / Supervisor (1) access. The DAU will only allow supervisor mode accesses to data space.
cpu_diu_sel | 1 | In | Block select from the CPU. When cpu_diu_sel is high both cpu_adr and cpu_dataout are valid
diu_cpu_rdy | 1 | Out | Ready signal to the CPU. When diu_cpu_rdy is high it indicates the last cycle of the access. For a write cycle this means cpu_dataout has been registered by the block and for a read cycle this means the data on diu_cpu_data is valid.
diu_cpu_berr | 1 | Out | Bus error signal to the CPU indicating an invalid access
DIU Read Interface to SoPEC Units | | |
<unit>_diu_rreq | 1 | In | SoPEC unit requests DRAM read. A read request must be accompanied by a valid read address.
<unit>_diu_radr[21:5] | 17 | In | Read address to DIU. 17 bits wide (256-bit aligned word).
diu_<unit>_rack | 1 | Out | Acknowledge from DIU that read request has been accepted and new read address can be placed on <unit>_diu_radr
diu_data[63:0] | 64 | Out | Data from DIU to SoPEC Units except CPU. First 64-bits is bits 63:0 of 256 bit word. Second 64-bits is bits 127:64 of 256 bit word. Third 64-bits is bits 191:128 of 256 bit word. Fourth 64-bits is bits 255:192 of 256 bit word.
dram_cpu_data[255:0] | 256 | Out | 256-bit data from DRAM to CPU.
diu_<unit>_rvalid | 1 | Out | Signal from DIU telling SoPEC Unit that valid read data is on the diu_data bus
DIU Write Interface to SoPEC Units | | |
<unit>_diu_wreq | 1 | In | SoPEC unit requests DRAM write. A write request must be accompanied by a valid write address.
<unit>_diu_wadr[21:5] | 17 | In | Write address to DIU except CPU, CDU. 17 bits wide (256-bit aligned word)
cpu_adr[21:0] | 22 | In | CPU Write address to DIU. 22 bits wide (8-bit aligned word). Addresses cannot cross a 256-bit word DRAM boundary.
cpu_diu_wmask[1:0] | 2 | In | Flag indicating format of CPU write to DRAM. 00: 8-bit write. 01: 16-bit write. 10: 32-bit write. 11: reserved. cpu_addr[2:0] are driven in accordance with the width of the data access indicated by cpu_diu_wmask. Addresses cannot cross a 256-bit word DRAM boundary.
cdu_diu_wadr[21:3] | 19 | In | CDU Write address to DIU. 19 bits wide (64-bit aligned word). Addresses cannot cross a 256-bit word DRAM boundary.
diu_<unit>_wack | 1 | Out | Acknowledge from DIU that write request has been accepted and new write address can be placed on <unit>_diu_wadr
<unit>_diu_data[63:0] | 64 | In | Data from SoPEC Unit to DIU except CPU. First 64-bits is bits 63:0 of 256 bit word. Second 64-bits is bits 127:64 of 256 bit word. Third 64-bits is bits 191:128 of 256 bit word. Fourth 64-bits is bits 255:192 of 256 bit word.
cpu_dataout[31:0] | 32 | In | Data from CPU to DIU.
<unit>_diu_wvalid | 1 | In | Signal from SoPEC Unit indicating that data on <unit>_diu_data is valid.
Outputs to DCU | | |
dau_dcu_cmdavail | 1 | Out | Signal indicating a DAU command is available i.e. dau_cmd_adr, dau_cmd_rwn and dau_cmd_refresh are valid.
dau_dcu_cmdadr[21:5] | 17 | Out | Signal indicating the address for the DRAM access. This is a 256-bit aligned DRAM address.
dau_dcu_cmdrwn | 1 | Out | Signal indicating the direction for the DRAM access (1=read, 0=write).
dau_dcu_cmdrefresh | 1 | Out | Signal indicating that a refresh command is to be issued. If asserted dau_cmd_adr and dau_cmd_rwn will be ignored.
dau_dcu_wdata | 256 | Out | 256-bit write data to DCU
dau_dcu_wmask | 256 | Out | 256-bit write data mask to DCU.
dau_dcu_wvalid | 1 | Out | Signal indicating valid write data and write mask.
Inputs from DCU | | |
dcu_dau_cmdaccept | 1 | In | Signal indicating that the DCU has accepted a valid command from the DAU.
dcu_dau_refreshcomplete | 1 | In | Signal indicating that the DCU has completed a refresh.
dcu_dau_rdata | 256 | In | 256-bit read data from DCU.
dcu_dau_rrvalid | 1 | In | Signal indicating valid read data on dcu_rdata.
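The cpu_diu_wmask encoding in the table above selects an 8, 16 or 32-bit CPU write within a 256-bit (32-byte) DRAM word. The sketch below illustrates one way to turn that encoding and the low five address bits into per-byte write enables. It is an assumption-laden illustration only: the 32-bit byte-enable representation, the function name and the natural-alignment check are all hypothetical and are not taken from the specification.

```python
# cpu_diu_wmask encoding from Table 81; 0b11 is reserved.
WIDTH_BYTES = {0b00: 1, 0b01: 2, 0b10: 4}


def cpu_write_byte_enables(addr_low: int, wmask: int) -> int:
    """Return a 32-bit value with one bit set per byte written.

    addr_low is the byte offset within the 256-bit DRAM word (0..31).
    """
    if wmask not in WIDTH_BYTES:
        raise ValueError("cpu_diu_wmask = 11 is reserved")
    if not 0 <= addr_low < 32:
        raise ValueError("byte offset must fit in 5 bits")
    width = WIDTH_BYTES[wmask]
    # Assumption: accesses are naturally aligned, which also guarantees
    # that a write never crosses a 256-bit DRAM word boundary.
    if addr_low % width != 0:
        raise ValueError("unaligned access")
    return ((1 << width) - 1) << addr_low
```

For example, a 32-bit write at byte offset 4 enables bytes 4 to 7 of the DRAM word.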



The CPU subsystem bus interface is described in more detail in Section 11.4.3. The DAU block will only
allow supervisor mode accesses to data space (i.e. cpu_acode[1:0] = b11). All other accesses will result in
diu_cpu_berr being asserted.
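The access rule above can be written as a small predicate. This is a sketch of the stated rule only; the function names are hypothetical and the bit assignments follow the cpu_acode description in Table 81.

```python
def dau_access_allowed(cpu_acode: int) -> bool:
    """True only for supervisor mode accesses to data space."""
    is_data = bool(cpu_acode & 0b01)        # cpu_acode[0]: Program (0) / Data (1)
    is_supervisor = bool(cpu_acode & 0b10)  # cpu_acode[1]: User (0) / Supervisor (1)
    return is_data and is_supervisor


def diu_cpu_berr(cpu_acode: int) -> bool:
    """Bus error is raised for every access that is not allowed."""
    return not dau_access_allowed(cpu_acode)
```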



Doc: SoPEC_hardware_design S3 Proprietary Document 

Version: 2.3 



29 Nov 2002 
Page 228 




SoPEC : Hardware Design 



20.13.3 DAU Configuration Registers 



Table 82. DAU configuration registers 







Address | Register | Bits | Reset | Description
--- | --- | --- | --- | ---
0x00 | Reset | 1 | 0x1 | A write to this register causes a reset of the DIU. This register can be read to indicate the reset state: 0 - reset in progress, 1 - reset not in progress.
0x04 | RefreshPeriod | 10 | 0x000 | Background refresh controller. When set to 0 background refresh is off, otherwise value indicates number of cycles between each refresh.
0x08 | NumMainTimeslots | 7 | 0x40 | Number of main timeslots (0-64)
0x09 | CPUTimeslots | 4 | 0x0 | CPUTimeslots out of TotalTimeslots are available for the CPU.
0x0A | TotalTimeslots | 4 | 0x0 | CPUTimeslots out of TotalTimeslots are available for the CPU.
0x100-0x1FC | MainTimeslot | [64][5] | 0x00 | Programmable main timeslots (up to 64 main timeslots)
0x200-0x208 | Sub3Timeslot | [3][5] | 0x00 | Programmable sub-timeslots (3 timeslots)
0x210-0x21C | Sub4Timeslot | [4][5] | 0x00 | Programmable sub-timeslots (4 timeslots)
0x220-0x234 | Sub6Timeslot | [6][5] | 0x00 | Programmable sub-timeslots (6 timeslots)
0x300 | ReadRoundRobinLevel | 12 | 0x000 | For each read requester plus refresh: 0 = level1 of round-robin, 1 = level2 of round-robin
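The address range 0x100-0x1FC for 64 main timeslot entries suggests one 32-bit word per entry. Under that assumption (the stride is inferred from the range, not stated in the table), the address of an individual entry could be computed as sketched below. The constant and function names are hypothetical.

```python
MAIN_TIMESLOT_BASE = 0x100
MAIN_TIMESLOT_COUNT = 64
REGISTER_STRIDE = 4  # assumption: each 5-bit entry occupies one 32-bit word


def main_timeslot_address(index: int) -> int:
    """Offset of MainTimeslot[index] within the DAU register space."""
    if not 0 <= index < MAIN_TIMESLOT_COUNT:
        raise ValueError("main timeslot index out of range")
    return MAIN_TIMESLOT_BASE + REGISTER_STRIDE * index
```

With this stride the last entry lands on 0x1FC, matching the upper end of the range in Table 82.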



Each main timeslot and sub-timeslot can be assigned a SoPEC DIU requester according to Table 83. Main
timeslots can be assigned SoPEC units, refresh and sub-timeslots. Sub-timeslots can be assigned SoPEC
units and refresh but not to other sub-timeslots.



Table 83. SoPEC DIU Encoding

Group | Name | Index (binary) | Index (hex)
--- | --- | --- | ---
 | None | b00000 | 0x00
Write | CPU(W) | b00001 | 0x01
 | SCB | b00010 | 0x02
 | CDU(W) | b00011 | 0x03
 | SFU(W) | b00100 | 0x04
 | DWU | b00101 | 0x05
Read | CPU(R) | b00110 | 0x06
 | CDU(R) | b00111 | 0x07
 | CFU | b01000 | 0x08
 | LBD | b01001 | 0x09
 | SFU(R) | b01010 | 0x0A
 | TE(TD) | b01011 | 0x0B
 | TE(TFS) | b01100 | 0x0C
 | HCU | b01101 | 0x0D
 | DNC | b01110 | 0x0E
 | LLU | b01111 | 0x0F
 | PCU | b10000 | 0x10
Others | Refresh | b10001 | 0x11
 | Sub1timeslot | b10010 | 0x12
 | Sub2timeslot | b10011 | 0x13
 | Sub3timeslot | b10100 | 0x14
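The encoding in Table 83, together with the rule that sub-timeslots cannot be assigned to other sub-timeslots, can be captured as follows. This is an illustrative sketch: the Python identifiers are hypothetical, and the unit names SCB, DWU and LBD reflect this sketch's reading of the scanned table.

```python
from enum import IntEnum


class DiuRequester(IntEnum):
    """5-bit timeslot encodings from Table 83."""
    NONE = 0x00
    CPU_W = 0x01
    SCB = 0x02
    CDU_W = 0x03
    SFU_W = 0x04
    DWU = 0x05
    CPU_R = 0x06
    CDU_R = 0x07
    CFU = 0x08
    LBD = 0x09
    SFU_R = 0x0A
    TE_TD = 0x0B
    TE_TFS = 0x0C
    HCU = 0x0D
    DNC = 0x0E
    LLU = 0x0F
    PCU = 0x10
    REFRESH = 0x11
    SUB1_TIMESLOT = 0x12
    SUB2_TIMESLOT = 0x13
    SUB3_TIMESLOT = 0x14


SUB_TIMESLOTS = frozenset({
    DiuRequester.SUB1_TIMESLOT,
    DiuRequester.SUB2_TIMESLOT,
    DiuRequester.SUB3_TIMESLOT,
})


def is_valid_assignment(entry: DiuRequester, in_sub_timeslot: bool) -> bool:
    """Main timeslots accept any encoding; sub-timeslots cannot nest."""
    return not (in_sub_timeslot and entry in SUB_TIMESLOTS)
```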



ReadRoundRobinLevel and ReadRoundRobinEnable registers are encoded in the bit order defined in
Table 84.

Table 84. Read round-robin registers bit order

Name | Bit index
--- | ---
CPU(R) | 0
CDU(R) | 1
CFU | 2
LBD | 3
SFU(R) | 4
TE(TD) | 5
TE(TFS) | 6
HCU | 7
DNC | 8
LLU | 9
PCU | 10
Refresh | 11
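Given the bit order in Table 84, a 12-bit ReadRoundRobinLevel value can be assembled from the names of the requesters that should sit in level 2 (a set bit means level 2, a clear bit level 1). The helper below is a hypothetical sketch; LBD is this sketch's reading of the fourth entry in the scanned table.

```python
RR_BIT_INDEX = {
    "CPU(R)": 0, "CDU(R)": 1, "CFU": 2, "LBD": 3,
    "SFU(R)": 4, "TE(TD)": 5, "TE(TFS)": 6, "HCU": 7,
    "DNC": 8, "LLU": 9, "PCU": 10, "Refresh": 11,
}


def read_round_robin_level(level2_requesters) -> int:
    """Build the 12-bit register value: 1 = level 2, 0 = level 1."""
    value = 0
    for name in level2_requesters:
        value |= 1 << RR_BIT_INDEX[name]
    return value
```

For instance, placing only CPU(R) and Refresh in level 2 sets bits 0 and 11.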
20.13.4 DIU Partition



[Figure 84: block diagram of the DRAM Access Unit (DAU), containing the CPU Interface and Arbitration Logic, Command Multiplexor, Read Write Data Multiplexor and Debug sub-blocks. The DAU connects the CPU and SoPEC Unit read and write interfaces to the DRAM Controller Unit (DCU), which connects to the eDRAM. Internal signals shown include unit_gnt, adr_sel, write_due, re_arbitrate, read_cmd_avail, write_cmd_avail, write_data_avail, debug_req, debug_enable, debug_start_adr and debug_end_adr. DAU to DCU signals shown include dau_cmd_avail, dau_cmd_refresh, dau_cmd_adr, dau_cmd_rwn, dcu_cmd_accept, dcu_rvalid, dcu_rdata, dau_wvalid, dau_wdata and dau_wmask.]

Figure 84. DIU Partition






20.13.5 CPU Interface and Arbitration Logic Sub-block

Table 85. CPU Interface and Arbitration Logic Sub-block IO Definition

Port name | Pins | I/O | Description
--- | --- | --- | ---
Clocks and Resets | | |
pclk | 1 | In | System Clock
prst_n | 1 | In | System reset, synchronous active low
CPU Interface data and control signals | | |
cpu_addr[12:2] | 11 | In | CPU address bus. 11 bits are required to decode the address space for this block
cpu_dataout[31:0] | 32 | In | Shared write data bus from the CPU
diu_cpu_datain[31:0] | 32 | Out | Read data bus from the DIU to the CPU
cpu_rwn | 1 | In | Common read/not-write signal from the CPU
cpu_fc[2:0] | 3 | In | CPU Function Code signals.
cpu_diu_sel | 1 | In | Block select from the CPU. When cpu_diu_sel is high both cpu_addr and cpu_dataout are valid
diu_cpu_rdy | 1 | Out | Ready signal to the CPU. When diu_cpu_rdy is high it indicates the last cycle of the access. For a write cycle this means cpu_dataout has been registered by the block and for a read cycle this means the data on diu_cpu_datain is valid.
diu_cpu_berr | 1 | Out | Bus error signal to the CPU indicating an invalid access.
DIU Read Interface to SoPEC Units | | |
<unit>_diu_rreq | 1 | In | SoPEC unit requests DRAM read.
DIU Write Interface to SoPEC Units | | |
<unit>_diu_wreq | 1 | In | SoPEC unit requests DRAM write.
Internal Inputs from other DAU blocks | | |
re_arbitrate | 1 | In | Signal telling the arbitration logic to choose the next arbitration winner.
debug_req | 1 | In | DIU request signal from Debug logic.
Internal Outputs to other DAU blocks | | |
debug_enable | 1 | Out | 1 = Enable DIU Debug, 0 = Disable DIU Debug
debug_start_adr[21:5] | 17 | Out | DIU debug start address.
debug_end_adr[21:5] | 17 | Out | DIU debug end address.
unit_gnt | 1 | Out | Signal lasting 1 cycle which indicates arbitration has occurred. Indicates adr_sel and write_due are valid.
adr_sel[4:0] | 5 | Out | Signal indicating which requesting SoPEC Unit has won arbitration.
write_due[1:0] | 2 | Out | write_due[0] indicates if the next arbitration winner will be a write access. write_due[1] indicates if the subsequent arbitration winner will be a write access. Valid on unit_gnt. write_due[1] is only required where 2 cycle random DRAM access is possible.
access_type[3:0] | 4 | Out | Signal indicating the origin of the winning arbitration. 0000 = main timeslot, 0001 = sub1 timeslot, 0010 = sub2 timeslot, 0011 = sub3 timeslot, 0100 = sub4 timeslot, 0101 = round-robin level1, 0110 = round-robin level2, 0111 = priority, 1000 = cpu round robin



20.13.5.1 CPU Interface and Arbitration Logic Sub-block Description

The CPU Interface and Arbitration Logic sub-block is shown in Figure 85. The CPU interface sub-block 
provides for the CPU to access DAU specific registers by reading or writing to the DAU address space. 

The CPU subsystem bus interface is described in more detail in Section 11.4.3. The DIU block will
only allow supervisor mode read or write accesses to data space (i.e. cpu_fc[2:0] = b101). All other
accesses will result in diu_cpu_berr being asserted.
[Figure 85: block diagram showing the CPU Interface, the configuration registers, the Arbitration Logic and the Refresh Counter. Inputs shown: <unit>_diu_rreq, <unit>_diu_wreq, re_arbitrate, cpu_rwn, cpu_diu_sel, cpu_addr[12:2], cpu_dataout[31:0], cpu_fc[2:0], debug_req. Outputs shown: diu_cpu_data[31:0], diu_cpu_rdy, diu_cpu_berr, unit_gnt, adr_sel, write_due, access_type, debug_enable, debug_start_adr[21:5], debug_end_adr[21:5].]

Figure 85. CPU Interface and Arbitration Logic

Arbitration is triggered by the signal re_arbitrate, with the signal unit_gnt indicating that arbitration has
occurred and the arbitration winner indicated by adr_sel[4:0]. Arbitration should take 1 clock cycle, so
unit_gnt is asserted the clock cycle after re_arbitrate and stays high for 1 clock cycle. adr_sel[4:0]
remains persistent until arbitration occurs again. The arbitration timing is shown in Figure 86.



[Figure 86: timing diagram of pclk, re_arbitrate, unit_gnt, adr_sel[4:0] (values 00000, 01000, 00001) and write_due[1:0] (values 00, 01, 00).]

Figure 86. Arbitration timing



The basic arbitration table is 64 entries of 5 bits. Arbitration works by having a pointer advance to the next
entry in the table whenever re_arbitrate is asserted. Four of the main slots can be assigned to a set of 4
sub-slots. Each of these sub-slots requires 4 entries of 5 bits and a pointer. The pointer advances along the
sub-entries every time arbitration selects a sub-slot. Slots can also be assigned to round-robin or priority
allocations.
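The pointer behaviour described above can be modelled in a few lines. This is a simplified sketch under stated assumptions: requesters are represented by strings rather than 5-bit codes, the main pointer is assumed to advance before the entry is read, and the class and method names are hypothetical. It does not model requests, round-robin or priority re-assignment.

```python
class ArbitrationTable:
    """Main timeslot table with optional sub-slot tables, each with a pointer."""

    def __init__(self, main_slots, sub_slots):
        self.main_slots = list(main_slots)              # up to 64 entries
        self.sub_slots = {k: list(v) for k, v in sub_slots.items()}
        self.main_ptr = -1                              # advanced on re_arbitrate
        self.sub_ptr = {k: 0 for k in self.sub_slots}

    def re_arbitrate(self):
        """Advance the main pointer and return the selected requester."""
        self.main_ptr = (self.main_ptr + 1) % len(self.main_slots)
        entry = self.main_slots[self.main_ptr]
        if entry not in self.sub_slots:
            return entry
        # The main slot refers to a sub-slot set: use its own pointer, which
        # only advances when arbitration selects this sub-slot.
        entries = self.sub_slots[entry]
        winner = entries[self.sub_ptr[entry]]
        self.sub_ptr[entry] = (self.sub_ptr[entry] + 1) % len(entries)
        return winner
```

With main slots HCU, SUB1, LLU, SUB1 and a SUB1 set of CFU and LBD, successive arbitrations alternate the sub-slot winner while the main sequence repeats.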

write_due[1:0] will point to timeslots 1 and 2 arbitration cycles in the future. write_due[1:0] looks ahead
in the arbitration and indicates if the next arbitration winner (based on currently requesting write
requesters) and the subsequent arbitration winner will be a write access. The write_due[1:0] functionality is
added to allow write transactions to be selected early by the command multiplexor sub-block, to offset the
latency of transferring the write data over the write data busses before the DRAM access can occur. This is
only required for IBM DRAM. With Toshiba and Philips DRAM, the DRAM access latency masks any
latency in transferring the write data.

If an assigned slot is not used (because its corresponding SoPEC Unit is not requesting) then it can be
re-assigned in a number of ways. Slots can be re-assigned using a round-robin or a priority based assignment.
Round-robin assignment can have up to 21 round-robin slots. Its implementation requires a pointer to keep
track of the currently assigned round-robin slot. If the slot pointed to is not requesting then the next
round-robin slot is considered and so on. Round-robin is effectively a priority assignment with the slots assigned
a priority according to the slot order. Round-robin and priority arbitration will be calculated in a
hierarchical manner shown in Figure 87. If no round-robin slots are requesting then DRAM access is re-assigned
according to priority.




Figure 87. Hierarchical priority based arbitration calculation 



It may be desirable to have 2 levels of round-robin arbitration. If there is no requester in the first level, then 
the arbitration looks at the second level. If there is no second-level requester then the DRAM access is 
assigned according to priority. 
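The fallback order described above (round-robin level 1, then level 2, then priority) can be sketched as follows. This is an illustration only: requesters are plain strings, the pointer handling is one possible reading of "the next round-robin slot is considered and so on", and all function and parameter names are hypothetical.

```python
def pick_round_robin(ring, start, requesting):
    """Scan the ring from start; return (winner, next_start) or (None, start)."""
    for offset in range(len(ring)):
        idx = (start + offset) % len(ring)
        if ring[idx] in requesting:
            return ring[idx], (idx + 1) % len(ring)
    return None, start


def reassign_unused_slot(requesting, level1, level2, priority, pointers):
    """Choose a requester for an unused timeslot.

    pointers is a dict with keys "level1" and "level2" holding the
    round-robin pointers; it is updated in place when a level wins.
    """
    for name, ring in (("level1", level1), ("level2", level2)):
        if ring:
            winner, pointers[name] = pick_round_robin(ring, pointers[name], requesting)
            if winner is not None:
                return winner
    # No round-robin requester at either level: fall back to fixed priority,
    # highest priority first.
    for unit in priority:
        if unit in requesting:
            return unit
    return None
```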
There is a background refresh counter which is reset whenever the arbitration logic selects a refresh. If the
refresh counter reaches refresh_period[9:0] it stops counting and asserts a signal ref_issue which forces
the next arbitration winner to be a refresh. This will delay any fixed timeslot allocation by one slot. If the
slot was being re-allocated to a non-fixed timeslot requester then refresh will win the arbitration.
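A behavioural sketch of the background refresh counter follows. It assumes the counter increments once per clock cycle and that a period of 0 disables background refresh, as the RefreshPeriod register description states; the class name and the signal name ref_issue as spelled here are assumptions of this sketch.

```python
class RefreshCounter:
    """Counts cycles since the last refresh and requests one when due."""

    def __init__(self, refresh_period: int):
        self.refresh_period = refresh_period  # 0 means background refresh off
        self.count = 0

    @property
    def ref_issue(self) -> bool:
        """Asserted once the counter has reached the programmed period."""
        return self.refresh_period != 0 and self.count >= self.refresh_period

    def tick(self, refresh_selected: bool) -> None:
        """Advance one cycle; reset when arbitration selects a refresh."""
        if refresh_selected:
            self.count = 0
        elif self.refresh_period != 0 and self.count < self.refresh_period:
            self.count += 1  # stops counting once the period is reached
```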

Debug can also request DIU access via the signal debug_req. DIU debug access, when enabled, will obtain
DIU access at the expense of CPU DIU slots.



20.13.5.2 Arbitration of writes



For write accesses, except the CPU, 256 bits of write data are transferred from the SoPEC DIU write
requesters over 64-bit write busses in 4 clock cycles. This write data transfer latency means that write
accesses must be arbitrated two timeslots in advance. Figure 88, which repeats the DIU write protocol
shown in Figure 73, shows why this is necessary.

If this were a read access, then the read address is captured by the DIU in cycle 4 and presented to the
DRAM in cycle 5. The read access at the DRAM will start in cycle 5. This corresponds to timeslot n+2. A
write access cannot start until all the write data is available i.e. until cycle 9. This is a 4 cycle delay. The
write access at the DRAM will not start until cycle 11 which corresponds to the start of timeslot [...].
Therefore, write arbitration must occur 2 timeslots in advance and incurs an additional latency of 2 cycles.

The exact timing of read and write accesses will be outlined in Section 20.13 Implementation. 



[Figure 88: timing diagram of pclk, <unit>_diu_wreq, <unit>_diu_wadr[21:5], diu_<unit>_wack, <unit>_diu_data[63:0] and <unit>_diu_wvalid over cycles 1 to 11, with timeslots n+1 to n+4 marked.]

Figure 88. Write Protocol shown for a SoPEC Unit making a single 256-bit access
20.13.6 Command Multiplexor Sub-block

Table 86. Command Multiplexor Sub-block IO Definition

Port name | Pins | I/O | Description
--- | --- | --- | ---
Clocks and Resets | | |
pclk | 1 | In | System Clock
prst_n | 1 | In | System reset, synchronous active low
DIU Read Interface to SoPEC Units | | |
<unit>_diu_radr[21:5] | 17 | In | Read address to DIU. 17 bits wide (256-bit aligned word).
diu_<unit>_rack | 1 | Out | Acknowledge from DIU that read request has been accepted and new read address can be placed on <unit>_diu_radr
DIU Write Interface to SoPEC Units | | |
<unit>_diu_wadr[21:5] | 17 | In | Write address to DIU except CPU, SCB, CDU. 17 bits wide (256-bit aligned word)
cpu_addr[21:0] | 22 | In | CPU Write address to DIU. 22 bits wide (8-bit aligned word). Addresses cannot cross a 256-bit word DRAM boundary.
cdu_diu_wadr[21:3] | 19 | In | CDU Write address to DIU. 19 bits wide (64-bit aligned word). Addresses cannot cross a 256-bit word DRAM boundary.
diu_<unit>_wack | 1 | Out | Acknowledge from DIU that write request has been accepted and new write address can be placed on <unit>_diu_wadr
debug_diu_wadr[21:5] | 17 | In | Debug write address to DIU. 17 bits wide (256-bit aligned word)
diu_debug_wack | 1 | Out | Acknowledge from DIU that debug write request has been accepted and new write address can be presented.
Internal Inputs | | |
unit_gnt | 1 | In | Signal lasting 1 cycle which indicates arbitration has occurred.
adr_sel[4:0] | 5 | In | Signal indicating which requesting SoPEC Unit has won arbitration.
write_due[1:0] | 2 | In | write_due[0] indicates if the next arbitration winner will be a write access. write_due[1] indicates if the subsequent arbitration winner will be a write access. Valid on unit_gnt. write_due[1] is only required where 2 cycle random DRAM access is possible.
read_cmd_avail | 1 | In | Signal indicating that command multiplexor can issue read accesses.
write_cmd_avail | 1 | In | Signal indicating that command multiplexor can issue write accesses.
write_data_avail | 1 | In | Signal indicating that valid write data is available for the current command.
Internal Outputs | | |
re_arbitrate | 1 | Out | Signal telling the arbitration logic to choose the next arbitration winner.
Signals from DCU | | |
dcu_cmd_accept | 1 | In | Signal indicating that the DCU has accepted a valid command from the DAU.
Signals to DCU | | |
dau_cmd_avail | 1 | Out | Signal indicating a DAU command is available i.e. dau_cmd_adr, dau_cmd_rwn and dau_cmd_refresh are valid.
dau_cmd_adr[21:5] | 17 | Out | Signal indicating the address for the DRAM access. This is a 256-bit aligned DRAM address.
dau_cmd_rwn | 1 | Out | Signal indicating the direction for the DRAM access.
dau_cmd_refresh | 1 | Out | Signal indicating that a refresh command is to be issued. If asserted cmd_adr and cmd_rwn will be ignored.



20.13.6.1 DAU-DCU interface Description

dau_cmd_avail indicates that the Command Multiplexor has a valid command to issue. When
dau_cmd_avail is asserted the signals dau_cmd_adr[21:5], dau_cmd_rwn and dau_cmd_refresh are valid.
In the case of a write command, dau_cmd_avail will not be asserted until the Read-Write Data
Multiplexor sub-block has valid write data to supply, indicated by write_data_avail, as well as a valid write
address.

The DCU indicates that it has accepted a command by asserting dcu_cmd_accept for 1 cycle. This
indicates to the Command Multiplexor that it can supply a new command to the DCU. The DCU cannot assert
dcu_cmd_accept until the Command Multiplexor presents a valid command as indicated by
dau_cmd_avail.
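The two handshake rules above can be expressed as simple combinational predicates. This is a sketch only: the function names and the dcu_ready input (standing in for whatever internal condition lets the DCU take a command) are hypothetical.

```python
def dau_cmd_avail(has_command: bool, is_write: bool, write_data_avail: bool) -> bool:
    """A command is offered to the DCU only when it is complete.

    Reads and refreshes are available as soon as they are queued; a write
    additionally needs its write data to be ready.
    """
    if not has_command:
        return False
    return write_data_avail if is_write else True


def dcu_cmd_accept(cmd_avail: bool, dcu_ready: bool) -> bool:
    """The DCU may only accept while a valid command is being presented."""
    return cmd_avail and dcu_ready
```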

20.13.6.2 Command Multiplexor Sub-block Description

The command multiplexor sub-block issues read, write or refresh commands to the DCU, according to the
SoPEC Unit selected for eDRAM access by the arbitration logic. The command multiplexor also signals
the arbitration logic to perform arbitration to select the next SoPEC Unit for eDRAM access.
Re-arbitration takes place, in general, when the DCU indicates on dcu_cmd_accept that it has accepted the previous
command.

A state-machine for the command multiplexor is shown in Figure 90. 
[Figure 90: state diagram. Recoverable transition labels: reset = 0, ArbitrationEnable = 0, ArbitrationEnable = 1.]

Figure 90. Arbitration and Address Transfer state-machine

The states in Figure 90 are defined as follows:

Table 87. Command Multiplexor state description

State | Description
--- | ---
IDLE | Controller goes to this state on reset and when ArbitrationEnable is de-asserted.
RE-ARB | When ArbitrationEnable is asserted and there is a DIU requester, assert re_arbitrate so Arbitration Logic will select source of next DRAM access indicated by adr_sel.
ACK | Send acknowledge to source of next DRAM access indicated by adr_sel.
ADR | Receive DRAM address from source indicated by adr_sel and in the next cycle place it in command queue along with adr_sel.



In the ACK and ADR states of Figure 90, the signal adr_sel is used to multiplex between the SoPEC Units
to capture the eDRAM address of the winning requester, as illustrated in Figure 91. The winning address is
written into a command queue together with adr_sel. If the winning requester is refresh then no address is
written into the command queue.
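The four states can be sketched as a next-state function. Only the IDLE behaviour (entered on reset and when ArbitrationEnable is de-asserted) is taken directly from the state descriptions; the remaining transitions, including the use of unit_gnt to leave RE-ARB and the return from ADR, are assumptions made for this illustration, and all Python names are hypothetical.

```python
from enum import Enum, auto


class State(Enum):
    IDLE = auto()
    RE_ARB = auto()
    ACK = auto()
    ADR = auto()


def next_state(state: State, arbitration_enable: bool,
               requester_pending: bool, unit_gnt: bool) -> State:
    """One step of a simplified command multiplexor state machine."""
    if not arbitration_enable:
        return State.IDLE                      # stated in the IDLE description
    if state is State.IDLE:
        return State.RE_ARB if requester_pending else State.IDLE
    if state is State.RE_ARB:
        # Assumption: wait for the arbitration logic to signal a winner.
        return State.ACK if unit_gnt else State.RE_ARB
    if state is State.ACK:
        return State.ADR                       # acknowledge, then take the address
    # ADR: assumption that the machine re-arbitrates if work is pending.
    return State.RE_ARB if requester_pending else State.IDLE
```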
[Figure 91: block diagram of the command multiplexor sub-block. SoPEC Units 1 to n each exchange adr and ack signals with the command multiplexor, which feeds a command queue (entries of adr_sel and adr). Command issuing logic drives dau_cmd_avail, dau_cmd_adr, dau_cmd_rwn and dau_cmd_refresh to the DRAM Controller Unit and receives write_data_avail and dcu_cmd_accept. The command multiplexor state-machine and re-arbitration logic use re_arbitrate, unit_gnt, adr_sel[4:0], write_due[1:0], write_cmd_avail and read_cmd_avail.]

Figure 91. Command multiplexor sub-block

The command at the head of the command queue drives the command issuing logic which generates the
signals required by the DRAM Controller Unit. The validity of the command is indicated by
dau_cmd_avail. The commands are captured by the DCU by a dcu_cmd_accept strobe lasting 1 cycle.
This signal also causes the command queue FIFO to be popped so that the next command is available to
be captured by the DCU. For a write command, dau_cmd_avail will not be asserted until there is also valid
write data present. This is indicated by the signal write_data_avail from the Read-Write Data Multiplexor.
Normally, the command queue will only have one filled location i.e. dcu_cmd_accept will cause a
command to be captured by the DCU and re-arbitration will be kicked off so as to provide the next command
to the DCU in time for the next dcu_cmd_accept strobe. This is true for read and refresh commands. The
timing is shown in Figure 92. It is assumed there is a pipeline delay between dcu_cmd_accept and
re_arbitrate and a further pipeline delay between the address received from the SoPEC Unit and the
address the DAU issues to the DCU.

For refresh commands:

    re_arbitrate <= dcu_cmd_accept AND (command queue not full).

For read commands which use the shared read bus, we must also ensure that the read multiplexor logic is
available to transmit the read data to the SoPEC read requester. This is indicated by the signal
read_cmd_avail which provides flow control from the read data multiplexor logic. So for read commands:

    re_arbitrate <= dcu_cmd_accept AND (command queue not full) AND
        (((cmd_adr_sel = shared read bus access) AND read_cmd_avail)
        OR (cmd_adr_sel = CPU))






A shared read bus access is any read access except the CPU. 



[Figure 92: timing diagram of pclk, dcu_cmd_accept, re_arbitrate, unit_gnt/ack, adr, command_queue1, command_queue2 and dau_cmd_adr for three successive commands 1, 2 and 3.]

Figure 92. Command Multiplexor sub-block timing for 2 cycle DRAM access read and refresh accesses

In the case of a write command, the write data must be transferred from the SoPEC requester before the
write can occur. Arbitration should occur early to allow for any delay for the write data to be transferred.
Figure 73 indicates that write data transfer over 64-bit busses will take a further 5 cycles after the address
is transferred. The arbitration logic produces the signals write_due[0] and write_due[1] which point to
timeslots 1 and 2 arbitration cycles in the future to indicate in advance future write accesses.

For Toshiba and Philips 8 or 9 cycle DRAM write access no such future write arbitration is required. In
that case

    re_arbitrate <= dcu_cmd_accept AND (command queue not full) AND
        (((cmd_adr_sel = shared read bus access) AND read_avail)
        OR ((cmd_adr_sel = write access) AND write_cmd_avail)).

For 2 cycle random DRAM access and 5 cycle write data transfer latency both write_due[0] and
write_due[1] are required. The command queue must be sized at 3 deep. If the timeslot sequence for
DRAM access is a read, followed by a write, followed by a write, then arbitration should occur to select
the read access immediately followed by re-arbitration twice more to select the future write accesses. The
condition for re-arbitration in advance of the write access is

    re_arbitrate <= (command queue not full) AND write_cmd_avail AND
        (write_due[0] OR write_due[1]).

write_cmd_avail provides flow control from the write data multiplexor logic.
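The re-arbitration conditions for refresh commands, read commands and look-ahead writes translate directly into boolean functions. The sketch below mirrors those three equations; the function names and the representation of write_due as a pair of booleans are assumptions made for the example.

```python
def re_arbitrate_refresh(dcu_cmd_accept: bool, queue_full: bool) -> bool:
    """Refresh commands: re-arbitrate when the DCU accepts and the queue has room."""
    return dcu_cmd_accept and not queue_full


def re_arbitrate_read(dcu_cmd_accept: bool, queue_full: bool,
                      is_cpu: bool, read_cmd_avail: bool) -> bool:
    """Read commands: shared read bus accesses also need read_cmd_avail."""
    shared_read_bus = not is_cpu  # any read access except the CPU
    return (dcu_cmd_accept and not queue_full
            and ((shared_read_bus and read_cmd_avail) or is_cpu))


def re_arbitrate_write_ahead(queue_full: bool, write_cmd_avail: bool,
                             write_due) -> bool:
    """Early re-arbitration for writes flagged 1 or 2 arbitration cycles ahead."""
    return (not queue_full) and write_cmd_avail and (write_due[0] or write_due[1])
```

Note that the look-ahead write condition does not depend on dcu_cmd_accept, which is what allows arbitration to run ahead of the DCU.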
Open Issue

The mechanism described here only pre-empts write accesses indicated by write_due[1:0] based on fixed
timeslot allocation. If a write access is selected based on un-used timeslot re-allocation then it will
introduce a latency in the access stream. One possibility is to disallow un-used timeslot allocation in the case
of write accesses. An alternative is to use 256-bit data busses to transfer write data but this is likely to
cause an increase in the area of the DIU and/or decrease the chip wiring utilization. Another possibility is to
have separate un-used timeslot reallocation logic for write accesses associated with write_due[1:0] i.e.
write requests are always only considered 1 or 2 arbitration cycles in the future. In this case there is
effectively a different time window for considering write requests and read requests.

20.13.7 Read and Write Multiplexor Sub-block

Table 88. Read and Write Multiplexor Sub-block IO Definition

Port name | Pins | I/O | Description

Clocks and Resets
pclk | 1 | In | System Clock
prst_n | 1 | In | System reset, synchronous active low

DIU Read Interface to SoPEC Units
diu_data[63:0] | 64 | Out | Data from DIU to SoPEC Units except CPU. First 64-bits is bits 63:0 of 256 bit word. Second 64-bits is bits 127:64 of 256 bit word. Third 64-bits is bits 191:128 of 256 bit word. Fourth 64-bits is bits 255:192 of 256 bit word.
diu_cpu_data[255:0] | 256 | Out | 256-bit data from DIU to CPU.
diu_<unit>_rvalid | 1 | Out | Signal from DIU telling SoPEC Unit that valid read data is on the diu_data bus.

DIU Write Interface to SoPEC Units
<unit>_diu_data[63:0] | 64 | In | Data from SoPEC Unit to DIU except CPU. First 64-bits is bits 63:0 of 256 bit word. Second 64-bits is bits 127:64 of 256 bit word. Third 64-bits is bits 191:128 of 256 bit word. Fourth 64-bits is bits 255:192 of 256 bit word.
cpu_diu_data[31:0] | 32 | In | Data from CPU to DIU.
cpu_addr[4:0] | 5 | In | Lower bits of CPU write address to indicate which byte within the 256-bit DRAM word is selected.
cpu_diu_wmask[1:0] | 2 | In | Flag indicating format of CPU write to DRAM. cpu_diu_wmask = "00": 8-bit write. cpu_diu_wmask = "01": 16-bit write. cpu_diu_wmask = "10": 32-bit write. cpu_diu_wmask = "11": reserved. cpu_addr[2:0] are driven in accordance with the width of the data access indicated by cpu_diu_wmask. Addresses cannot cross a 256-bit word DRAM boundary.
<unit>_diu_wvalid | 1 | In | Signal from SoPEC Unit indicating that data on <unit>_diu_data is valid.

Internal Inputs
unit_gnt | 1 | In | Signal lasting 1 cycle which indicates arbitration has occurred.
adr_sel[4:0] | 5 | In | Signal indicating which requesting SoPEC Unit has won arbitration.

Internal Outputs
read_cmd_avail | 1 | Out | Signal indicating that command multiplexor can issue read accesses.
write_cmd_avail | 1 | Out | Signal indicating that command multiplexor can issue write accesses.
write_data_avail | 1 | Out | Signal indicating that valid write data is available for the current command.

DCU Inputs
dcu_rdata | 256 | In | 256-bit read data from DCU.
dcu_rvalid | 1 | In | Signal indicating valid read data on dcu_rdata.

DCU Outputs
dau_wdata | 256 | Out | 256-bit write data to DCU.
dau_wmask | 256 | Out | 256-bit write data mask to DCU for IBM DRAM (byte masks are used for Philips and Toshiba DRAM).
dau_wvalid | 1 | Out | Signal indicating valid write data and write mask.

Debug signals
debug_wdata | 256 | In | 256-bit debug write data.
debug_wvalid | 1 | In | 256-bit debug write data valid.
read_adr_sel[4:0] | 5 | Out | Signal indicating the SoPEC Unit for which the current read transaction is occurring.
read_complete | 1 | Out | Signal indicating that read transaction to SoPEC Unit indicated by read_adr_sel is complete.
write_adr_sel[4:0] | 5 | Out | Signal indicating the SoPEC Unit for which the current write transaction is occurring.
write_complete | 1 | Out | Signal indicating that write transaction to SoPEC Unit indicated by write_adr_sel is complete.

20.13.7.1 Read Multiplexor logic description

Figure 93. Read multiplexor logic
(Diagram not reproduced. Labels: dcu_rvalid; dcu_rdata, 256 bits; diu_cpu_data, 256 bits; two-entry read data buffer; 64-bit diu_data and rvalid to SoPEC Units 1, 2, ... n.)

There are 2 read channels - one for the CPU and a shared read bus for the rest of SoPEC. The shared read 
bus has buffering for 2 times 256-bits of read data i.e 256-bits of data can be received from the DCU while 
data is being transferred over the 64-bit shared read bus to the SoPEC Units. Once a read address is issued 
by the arbitration logic the adr_sel[4:0] value is put into a read command queue in the read control logic.
The queued adr_sel[4:0] values allow the dcu_rvalid and read data from the DCU to be directed to the
correct source. In the case of the CPU bus dcu_rvalid and dcu_rdata can be multiplexed by the
adr_sel[4:0] value at the head of the FIFO directly to the CPU. If the incoming data goes over the shared
read channel then the data is stored in a 2 deep 256-bit read data buffer and output over 4 cycles to the
SoPEC requester when the previous transaction on the shared read bus is complete.

The depth of the adr_sel[4:0] read command queue is 2. When the queue is full no further adr_sel[4:0] 
can be accepted and no further read commands can be issued by the command multiplexor to the DCU.
This provides flow control back to the re-arbitration logic in the command multiplexor. The signal
read_cmd_avail indicates that spaces are available in the read command queue.
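
The flow control described above can be modelled as a bounded FIFO. The sketch below is a behavioural illustration only; the class and method names are assumptions made for the example and do not correspond to signals in the design, apart from read_cmd_avail.

```python
from collections import deque


class ReadCommandQueue:
    """Behavioural model of the 2 deep adr_sel[4:0] read command queue."""

    DEPTH = 2

    def __init__(self):
        self._entries = deque()

    @property
    def read_cmd_avail(self):
        # Asserted while at least one space is free in the queue.
        return len(self._entries) < self.DEPTH

    def issue(self, adr_sel):
        """Queue adr_sel for a read command issued to the DCU."""
        if not self.read_cmd_avail:
            raise RuntimeError("queue full: no further read commands can be issued")
        self._entries.append(adr_sel & 0x1F)

    def complete(self):
        """Return the adr_sel at the head of the queue when its data arrives."""
        return self._entries.popleft()
```

Once two reads are outstanding, read_cmd_avail goes low until complete() retires the oldest entry, which is what holds off further re-arbitration for reads.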

The 2 deep command queue and the double data buffer mean that the access rate will be limited to
whichever takes longer: DRAM access or transfer of read data over the shared read data bus. Some extra logic
may need to be added to time the assertion of read_cmd_avail so that the read latency is kept to a
minimum.

20.13.7.2 Write Multiplexor logic description

Figure 94.
(Diagram not reproduced. Labels: cpu_diu_wdata; write data buffer with entries 1, 2, 3.)

Once a write address is issued by the arbitration logic, the adr_sel[4:0] value is put into a write command
queue FIFO in the write control logic. The queued adr_sel[4:0] values allow the wvalid and write data
from the SoPEC requester to be multiplexed to the DCU. The write multiplex logic is duplicated 2 times to
provide two overlapping write channels. If 256-bit write data busses are used then a single write channel
which can be shared by CPU is all that is required.

The depth of the adr_sel[4:0] queue is 3. When the queue is full no further adr_sel[4:0] values from the
CPU Arbitration block will be accepted. This provides flow control back to the re-arbitration logic in the
Command Multiplexor sub-block. The 2 channels cannot select the same SoPEC write requester.
write_cmd_avail is asserted whenever there is a space in the queue. There are 2 write pointers and 1 read
pointer. Each of the channels has a write pointer associated with it. write_data_avail indicates that valid
write data is available to be issued along with the address the command multiplexor will issue.

There are 2 special cases for write accesses - CPU writes and CDU writes. 

CPU writes 

In the case of CPU writes the CPU write data bus is only 32-bits wide. cpu_diu_wmask[1:0] indicates how
many bits have to be written: 8, 16 or 32-bits. The associated address cpu_addr[21:0] is a byte aligned
address. The actual DRAM write must be a 256-bit access. The command multiplexor issues the 256-bit
DRAM address cpu_addr[21:5]. cpu_addr[4:0] and cpu_diu_wmask[1:0] are used to calculate the bit
write mask wmask[255:0] for the write access.

8-bit write: cpu_diu_wmask[1:0] = '00':

bits 8*cpu_addr[4:0] to (8*(cpu_addr[4:0]+1) - 1) of wmask[255:0] are asserted.

16-bit write: cpu_diu_wmask[1:0] = '01':

bits 8*cpu_addr[4:0] to (8*cpu_addr[4:0] + 15) of wmask[255:0] are asserted.

32-bit write: cpu_diu_wmask[1:0] = '10':

bits 8*cpu_addr[4:0] to (8*cpu_addr[4:0] + 31) of wmask[255:0] are asserted.
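
As a worked illustration of the CPU write mask, the sketch below computes wmask[255:0] as a Python integer. It assumes that a write of width W bits starting at byte address cpu_addr[4:0] covers bits 8*cpu_addr[4:0] through 8*cpu_addr[4:0] + W - 1; the function name and the error handling are assumptions made for the example.

```python
# Width in bits selected by each cpu_diu_wmask[1:0] encoding ("11" is reserved).
WRITE_WIDTH_BITS = {0b00: 8, 0b01: 16, 0b10: 32}


def cpu_write_mask(cpu_addr_4_0, cpu_diu_wmask):
    """Return the 256-bit write mask for a CPU write as an integer."""
    if cpu_diu_wmask not in WRITE_WIDTH_BITS:
        raise ValueError("cpu_diu_wmask encoding is reserved")
    if not 0 <= cpu_addr_4_0 <= 31:
        raise ValueError("cpu_addr[4:0] must be in the range 0..31")
    width = WRITE_WIDTH_BITS[cpu_diu_wmask]
    first_bit = 8 * cpu_addr_4_0
    if first_bit + width > 256:
        # Addresses cannot cross a 256-bit DRAM word boundary.
        raise ValueError("write crosses the 256-bit DRAM word boundary")
    return ((1 << width) - 1) << first_bit
```

For a 16-bit write at byte address 2 this asserts bits 16 to 31 of the mask, leaving the rest of the 256-bit word untouched.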

CDU writes 

Each CDU write access is a burst of 4 times 64-bits of write data to the 64-bits of the 256-bit DRAM
address indicated by cdu_diu_wadr[21:3] and the 3 subsequent 256-bit DRAM words. If these 4 DRAM
words lie in the same DRAM row then an efficient access will be obtained. The command multiplexor
logic must issue 4 successive accesses to 256-bit DRAM addresses cdu_diu_wadr[21:5], +1, +2, +3.
wmask[255:0] is calculated using cdu_diu_wadr[4:3], i.e. bits 64*cdu_diu_wadr[4:3] to
(64*(cdu_diu_wadr[4:3]+1) - 1) are asserted.
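
The CDU burst can be sketched in the same way. The function below is a hypothetical helper, not part of the design: it takes the write address as a 22-bit byte address and returns the four (256-bit word address, write mask) pairs that the command multiplexor would issue.

```python
def cdu_burst_accesses(cdu_diu_wadr):
    """Expand one CDU write request into its 4 DRAM accesses.

    cdu_diu_wadr is treated as a 22-bit byte address; bits [21:5] select the
    256-bit DRAM word and bits [4:3] select the 64-bit lane within it.
    """
    word_adr = (cdu_diu_wadr >> 5) & 0x1FFFF   # cdu_diu_wadr[21:5]
    lane = (cdu_diu_wadr >> 3) & 0x3           # cdu_diu_wadr[4:3]
    # Bits 64*lane to 64*(lane+1)-1 of wmask[255:0] are asserted.
    wmask = ((1 << 64) - 1) << (64 * lane)
    return [(word_adr + offset, wmask) for offset in range(4)]
```

The same lane mask is reused for all four words, so each access writes one 64-bit column of four consecutive 256-bit DRAM words.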

20.13.8 DIU Debug Sub-Block

Table 89. DIU Debug Sub-block IO Definition

Port name | Pins | I/O | Description

Clocks and Resets
pclk | 1 | In | System Clock
prst_n | 1 | In | System reset, synchronous active low

CPU Interface and Arbitration Logic
debug_enable | 1 | In | 1 = Enable DIU Debug. 0 = Disable DIU Debug.
debug_start_adr[21:5] | 17 | In | DIU debug start address.
debug_end_adr[21:5] | 17 | In | DIU debug end address.
debug_req | 1 | Out | DIU request signal from Debug logic.
<unit>_diu_rreq | 1 | In | SoPEC unit requests DRAM read. A read request must be accompanied by a valid read address.
<unit>_diu_wreq | 1 | In | SoPEC unit requests DRAM write. A write request must be accompanied by a valid write address.
unit_gnt | 1 | In | Signal lasting 1 cycle which indicates arbitration has occurred. Indicates adr_sel and write_due are valid.
adr_sel[4:0] | 5 | In | Signal indicating which requesting SoPEC Unit has won arbitration.
access_type[3:0] | 4 | In | Signal indicating the origin of the winning arbitration. 0000 = main timeslot. 0001 = sub1 timeslot. 0010 = sub2 timeslot. 0011 = sub3 timeslot. 0100 = sub4 timeslot. 0101 = round-robin level1. 0110 = round-robin level2. 0111 = priority. 1000 = cpd round robin.

DRAM Control Unit
dcu_refresh_complete | 1 | In | Signal indicating that the DCU has completed a refresh. Exact timing needs to be defined.

Read Write Data Multiplexor
debug_wdata | 256 | Out | 256-bit debug write data.
debug_wvalid | 1 | Out | 256-bit debug write data valid.
read_adr_sel[4:0] | 5 | In | Signal indicating the SoPEC Unit for which the current read transaction is occurring.
read_complete | 1 | In | Signal indicating that read transaction to SoPEC Unit indicated by read_adr_sel is complete.
write_adr_sel[4:0] | 5 | In | Signal indicating the SoPEC Unit for which the current write transaction is occurring.
write_complete | 1 | In | Signal indicating that write transaction to SoPEC Unit indicated by write_adr_sel is complete.



External visibility of the DIU must be provided for debug purposes. To allow this, special debug logic is
added to the DIU. When DIU debug is enabled by debug_enable, the DIU debug sub-block will collect
debug data. The DIU debug sub-block will itself have a 256-bit double buffer interface to the DIU. When
256-bits of debug data have been collected then the DIU debug sub-block will request access to DRAM by
asserting debug_req. When the request is acknowledged on diu_debug_wack the DIU debug circuit will supply a
write address, 256-bits of write data and a write valid. The arbitration logic will give the DIU Debug priority over CPU
slots. This means the Debug 256-bit double-buffer will never overflow and debug will not affect the access of other
blocks except the CPU (but this should not be important as debug will request with a low frequency and there are
many timeslots assigned to the CPU). The DIU Debug circuit will generate a write address by incrementing (and
wrapping around) between debug_start_adr[21:5] and debug_end_adr[21:5].
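
The wrap-around write address generation can be sketched as follows. The text does not say whether the end address is itself written before wrapping; this sketch assumes it is (an inclusive range), and the function name is hypothetical.

```python
def next_debug_adr(current, debug_start_adr, debug_end_adr):
    """Return the next 256-bit debug write address, wrapping at the end.

    All arguments are 17-bit word addresses ([21:5]). The range is assumed
    to be inclusive of debug_end_adr.
    """
    if not debug_start_adr <= current <= debug_end_adr:
        raise ValueError("current address is outside the debug region")
    if current == debug_end_adr:
        return debug_start_adr
    return current + 1
```

With a region of 0x100 to 0x102 the address sequence is 0x100, 0x101, 0x102, 0x100, and so on, so the debug region acts as a circular buffer in DRAM.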

Two kinds of debug information seem sensible to gather.

a. The order and source of DIU requesters winning. This is obtained by storing adr_sel[4:0]
along with the access_type[3:0] every time unit_gnt is asserted.

b. The time between a DIU requester requesting an access and completing the access. This
information is obtained by having a counter for each DIU requester. The counter is reset and starts
counting when the Unit starts requesting. The count is reset when the read or write access is
complete as indicated by read_complete AND read_adr_sel[4:0] OR write_complete AND
write_adr_sel[4:0]. When refresh is complete this is indicated by dcu_refresh_complete.
Typically most SoPEC DIU requesters require an access every 256 cycles so a 10-bit counter is
sufficient for most requesters. HCU and TE(TFS) require only a few accesses per line so in this
case 15-bit counters are adequate. The count is returned along with the index of the SoPEC
Unit.



The two kinds of debug information both need to be written to the DRAM debug channel. This can be
achieved by filling separate 256-bit double buffers with the 2 sources of information. Each 256-bit word
may contain un-used bits depending on the packing. The first bit of each 256-bit debug word will indicate 
which of the two kinds of debug information is contained therein (0 for a, 1 for b above). 

PEP Subsystem

21 PEP Controller Unit (PCU)

21.1 Overview 

The PCU has three functions: 

• The first is to act as a bus bridge between the CPU-bus and the PCU-bus for reading and writing PEP
configuration registers. 

• The second is to support page banding by allowing the PEP blocks to be reprogrammed between bands
by retrieving commands from DRAM instead of being programmed directly by the CPU.

• The third is to send register debug information to the RDU, within the CPU subsystem, when the PCU 
is in Debug Mode. 

21.2 Interfaces between PCU and other units



Figure 95. Block diagram of PCU
(Diagram not reproduced. It shows the PEP controller unit with its end of band unit and DRAM interface. CPU-side labels: cpu_adr, cpu_dataout, pcu_cpu_data, cpu_rwn, cpu_pcu_sel, pcu_cpu_rdy, cpu_acode, pcu_cpu_berr, pcu_cpu_debug_valid. PEP-side labels: XXX_pcu_rdy, XXX_pcu_data, pcu_XXX_sel, pcu_dataout, pcu_adr, pcu_rwn. Other labels: cdu_finishedband from the CDU, lbd_finishedband from the LBD, the TE, and the Interrupt Controller Unit.)



21.3 Bus bridge

The PCU is a bus-bridge between the CPU-bus and the PCU-bus. The PCU is a slave on the CPU-bus but
is the only master on the PCU-bus. See Figure 13 on page 39.

21.3.1 CPU accessing PEP



All the blocks in the PEP can be addressed by the CPU via the PCU. The MMU in the CPU-subsystem 
will decode a PCU select signal, cpu_pcu_sel, for all the PCU mapped addresses (see section 11.4.3 on
page 70). Using cpu_adr bits 15-12 the PCU will decode individual block selects for each of the blocks
within the PEP. The PEP blocks then decode the remaining address bits needed to address their PCU-bus
mapped registers. Note: the CPU is only permitted to perform supervisor-mode data-type accesses of the
PEP, i.e. cpu_acode = 11. If the PCU is selected by the CPU and any other code is present on the
cpu_acode bus the access is ignored by the PCU and the pcu_cpu_berr signal is strobed.

CPU commands have priority over DRAM commands. When the PCU is executing each set of four com- 
mands retrieved from DRAM the CPU can access PCU-bus registers. In the case that DRAM commands 
are being executed and the CPU resets the CmdSource to zero, the contents of the DRAM CmdFifo is 
invalidated and no further commands from the fifo are executed. The CmdPending and NextBandCmdEn- 
able work registers are also cleared. 



21.4 Page banding

The PCU can be programmed to associate microcode in DRAM with each finishedband signal. When a
finishedband signal is asserted the PCU will read commands from DRAM and execute these commands.
These commands are each 64-bits (see Section 21.8.5) and consist of 32 address bits and 32 data bits
and allow PCU mapped registers to be programmed directly by the PCU.

If more than one finishedband signal is received at the same time, or others are received while microcode
is already executing, the PCU will hold the commands as pending, and will execute them at the first
opportunity.

Each microcode program associated with cdu_finishedband, lbd_finishedband and te_finishedband would
simply restart the appropriate unit with new addresses - a total of about 4 or 5 microcode instructions. As
well, or alternatively, pcu_finishedband can be used to set up all of the units and therefore involves many
more instructions. This minimizes the time that a unit is idle in between bands. The pcu_finishedband
control signal is issued once the specified combination of CDU, LBD and TE (programmed in
BandSelectMask) have finished their processing for a band.

21.5 Interrupts, address legality and security

Interrupts are generated when the various page expansion units have finished a particular band of data
from DRAM. The cdu_finishedband, lbd_finishedband and te_finishedband signals are combined in the
PCU into a single interrupt pcu_finishedband which is exported by the PCU to the interrupt controller.

The PCU mapped registers should only be accessible from Supervisor Data Mode. The area of DRAM
where PCU commands are stored should be a Supervisor Mode only DRAM area. Configuration register
address legality is not enforced by the MMU, i.e. the MMU does not check if the block address points to a
valid PEP subsystem block. When the PCU is executing commands from the CPU, any block-address decoded
from a command which is not part of the PEP block-address map will cause the PCU to ignore the
command and strobe the pcu_invalid_address interrupt signal. The CPU can then interrogate the PCU to find
the source of the illegal command.

When the PCU is executing commands from DRAM, any address decoded from a command which is not
part of the PEP address map will cause the PCU to:

• Cease execution of current command and flush all remaining commands already retrieved from
DRAM.

• Clear CmdPending work-register.

• Clear NextBandCmdEnable registers.

• Set CmdSource to zero.

In addition to cancelling all current and pending DRAM accesses the PCU strobes the
pcu_invalid_address interrupt signal. The CPU can then interrogate the PCU to find the source of the
illegal command.

21.6 Debug Mode 

When there is a need to monitor the (possibly changing) value in any PEP configuration register the PCU
may be placed in Debug Mode. This is done via the CPU setting certain Debug Address and Debug Enable
registers within the PCU. Once in Debug Mode the PCU continually performs read accesses of the target PEP
configuration register (following the protocol detailed in Section 21.8.2) and sends the read value to the 
RDU. Debug Mode has the lowest priority of all PCU functions: if the CPU wishes to perform an access or 
there are DRAM commands to be executed they will interrupt the Debug access, and the PCU will resume 
Debug access once a CPU or DRAM command has completed. 

21.7 Implementation

21.7.1 Definitions of I/O 



Table 90. PCU Port List

Port name | Pins | I/O | Description

Clocks and Resets
pclk | 1 | In | SoPEC functional clock
prst_n | 1 | In | Active-low, synchronous reset in pclk domain

End of Band Functionality
cdu_finishedband | 1 | In | Finished band signal from CDU
lbd_finishedband | 1 | In | Finished band signal from LBD
te_finishedband | 1 | In | Finished band signal from TE
pcu_finishedband | 1 | Out | Asserted once the specified combination of CDU, LBD, and TE have finished their processing for a band.

PCU address error
pcu_icu_address_invalid | 1 | Out | Strobed if PCU decodes a non PEP address from commands retrieved from DRAM or CPU.

CPU Subsystem Interface Signals
cpu_adr[15:2] | 14 | In | CPU address bus. 14 bits are required to decode the address space for the PEP.
cpu_dataout[31:0] | 32 | In | Shared write data bus from the CPU
pcu_cpu_data[31:0] | 32 | Out | Read data bus to the CPU
cpu_rwn | 1 | In | Common read/not-write signal from the CPU
cpu_acode[1:0] | 2 | In | CPU Access Code signals. These decode as follows: 00 - User program access. 01 - User data access. 10 - Supervisor program access. 11 - Supervisor data access.
cpu_pcu_sel | 1 | In | Block select from the CPU. When cpu_pcu_sel is high both cpu_adr and cpu_dataout are valid.
pcu_cpu_rdy | 1 | Out | Ready signal to the CPU. When pcu_cpu_rdy is high it indicates the last cycle of the access. For a write cycle this means cpu_dataout has been registered by the block and for a read cycle this means the data on pcu_cpu_data is valid.
pcu_cpu_berr | 1 | Out | Bus error signal to the CPU indicating an invalid access.
pcu_cpu_debug_valid | 1 | Out | Debug Data valid on pcu_cpu_data bus. Active high.

PCU Interface to PEP blocks
pcu_adr[11:2] | 10 | Out | PCU address bus. The 10 least significant bits of cpu_adr[15:2] allow 1024 32-bit word addressable locations per PEP block. Only the number of bits required to decode the address space are exported to each block.
pcu_dataout[31:0] | 32 | Out | Shared write data bus from the PCU
<unit>_pcu_datain[31:0] | 32 | In | Read data bus from each PEP subblock to the PCU
pcu_rwn | 1 | Out | Common read/not-write signal from the PCU
pcu_<unit>_sel | 1 | Out | Block select for each PEP block from the PCU. Decoded from the 4 most significant bits of cpu_adr[15:2]. When pcu_<unit>_sel is high both pcu_adr and pcu_dataout are valid.
<unit>_pcu_rdy | 1 | In | Ready signal from each PEP block to the PCU. When <unit>_pcu_rdy is high it indicates the last cycle of the access. For a write cycle this means pcu_dataout has been registered by the block and for a read cycle this means the data on <unit>_pcu_datain is valid.

DIU Read Interface signals
pcu_diu_rreq | 1 | Out | PCU requests DRAM read. A read request must be accompanied by a valid read address.
pcu_diu_radr[21:5] | 17 | Out | Read address to DIU. 17 bits wide (256-bit aligned word).
diu_pcu_rack | 1 | In | Acknowledge from DIU that read request has been accepted and new read address can be placed on pcu_diu_radr.
diu_data[63:0] | 64 | In | Data from DIU to PCU. First 64-bits is bits 63:0 of 256 bit word. Second 64-bits is bits 127:64 of 256 bit word. Third 64-bits is bits 191:128 of 256 bit word. Fourth 64-bits is bits 255:192 of 256 bit word.
diu_pcu_rvalid | 1 | In | Signal from DIU telling PCU that valid read data is on the diu_data bus.

21.7.2 Configuration Registers



Table 91. PCU Configuration Registers

Address | Register | #bits | Reset | Description

Control registers
0x00 | Reset | 1 | 0x1 | A write to this register causes a reset of the PCU. This register can be read to indicate the reset state: 0 - reset in progress. 1 - reset not in progress.
0x04 | CmdAdr[21:5] (256-bit aligned DRAM address) | 17 | 0x00000 | The address of the next set of commands to retrieve from DRAM. When this register is written to, either by the CPU or DRAM command, 1 is also written to CmdSource to cause the execution of the commands at the specified address.
0x08 | BandSelectMask[2:0] | 3 | 0x0 | Selects which input finishedBand flags are to be watched to generate the combined finishedAllBand signal. Bit0 - lbd_finishedband. Bit1 - cdu_finishedband. Bit2 - te_finishedband.
0x0C, 0x10, 0x14, 0x18 | NextBandCmdAdr[3:0][21:5] (256-bit aligned DRAM address) | 4x17 | 0x00000 | The address to transfer to CmdAdr as soon as possible after the next finishedBand[n] signal has been received as long as NextBandCmdEnable[n] is set. A write from the PCU to NextBandCmdAdr[n] with a non-zero value also sets NextBandCmdEnable[n]. A write from the PCU to NextBandCmdAdr[n] with a 0 value clears NextBandCmdEnable[n].
0x20 | CmdSource | 1 | 0x0 | 0 - commands are taken from the CPU. 1 - commands are taken from the CPU as well as DRAM at CmdAdr.
0x24 | DebugSelect[15:2] | 14 | 0x0000 | Debug address select. Indicates the address of the register to report on the pcu_cpu_data bus when it is not otherwise being used, and the PEP bus is not being used. Bits [15:12] select the unit (see Table 92). Bits [11:2] select the register within the unit.

Work registers (read only)
0x28 | InvalidAddress[21:3] (64-bit aligned DRAM address) | 19 | 0 | Address of illegal 64-bit command in DRAM. Only valid when pcu_icu_address_invalid has been strobed.
0x2C | CmdPending | 4 | 0 | For each bit: 0 - no commands pending for NextBandCmdAdr[n]. 1 - commands pending for NextBandCmdAdr[n]. Read only register.
0x34 | FinishedSoFar | 3 | 0x0 | The appropriate bit is set whenever the corresponding input finishedBand flag is set and the corresponding bit in BandSelectMask is also set. If all FinishedSoFar bits are set wherever BandSelectMask bits are also set, all FinishedSoFar bits are cleared and the output finishedAllBand signal is given. Read only register.
0x38 | NextBandCmdEnable | 4 | 0x0 | This register can be written to indirectly (i.e. the bits are set or cleared via writes to NextBandCmdAdr[n]). For each bit: 0 - do nothing at the next finishedBand[n] signal. 1 - Execute instructions at NextBandCmdAdr[n] as soon as possible after receipt of the next finishedBand[n] signal. Bit0 - lbd_finishedband. Bit1 - cdu_finishedband. Bit2 - te_finishedband. Bit3 - finishedAllBand. Read only register.



21.8 Detailed description



21.8.1 PEP Blocks Register Map 

All PEP accesses are 32-bit register accesses.

From Table 92 it can be seen that four bits only are necessary to address each of the sub-blocks within the
PEP part of SoPEC. Up to 14 bits may be used to address any configurable 32-bit register within PEP. This
gives scope for 1024 configurable registers per sub-block. This address will come either from the CPU or
from a command stored in DRAM. The bus is assembled as follows:

• adr[15:12] = sub-block address

• adr[n:2] = 32-bit register address within sub-block, only the number of bits required to decode the
registers within each sub-block are used



Table 92. PEP blocks Register Map

Block | Block select decode (cpu_adr[15:12])
PCU | 0x0
CDU | 0x1
CFU | 0x2
LBD | 0x3
SFU | 0x4
TE | 0x5
TFU | 0x6
HCU | 0x7
DNC | 0x8
DWU | 0x9
LLU | 0xA
PHI | 0xB
Reserved | 0xC to 0xF
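
The address split described in Section 21.8.1 can be illustrated with a small decoder. This is a sketch only: the dictionary reads the Table 92 entries at 0x2, 0x3 and 0x8 as CFU, LBD and DNC, and the function name and return convention are assumptions made for the example.

```python
# Block select decode values from Table 92 (0xC to 0xF are reserved).
PEP_BLOCKS = {
    0x0: "PCU", 0x1: "CDU", 0x2: "CFU", 0x3: "LBD",
    0x4: "SFU", 0x5: "TE", 0x6: "TFU", 0x7: "HCU",
    0x8: "DNC", 0x9: "DWU", 0xA: "LLU", 0xB: "PHI",
}


def decode_pep_address(adr):
    """Split a PEP byte address into (block name, 32-bit register index).

    adr[15:12] selects the sub-block and adr[11:2] selects one of up to
    1024 registers inside it. The block name is None for reserved values.
    """
    block_select = (adr >> 12) & 0xF
    register_index = (adr >> 2) & 0x3FF
    return PEP_BLOCKS.get(block_select), register_index
```

For example, decode_pep_address(0x5008) selects register 2 of the TE, while an address in the reserved range returns None as the block name, which is the case where the PCU would flag an invalid address.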



21.8.2 Internal PCU PEP protocol 

The PCU performs PEP configuration register accesses via a select signal, pcu_<block>_sel. The read/
write sense of the access is communicated via the pcu_rwn signal (1 = read, 0 = write). Write data is
clocked out, and read data clocked in upon receipt of the appropriate select-read/write-address
combination.



Figure 96. PCU accesses to PEP registers
(Timing diagram not reproduced. Traces: pclk, pcu_adr, pcu_rwn, pcu_<block>_sel, <block>_pcu_rdy, pcu_dataout[31:0], <block>_pcu_data[31:0].)

Figure 96 shows a write operation followed by a read operation. The read operation is shown with wait
states while the PEP block returns the read data. 

For access to the PEP blocks a simple bus protocol is used. The PCU first determines which particular PEP 
block is being addressed so that the appropriate block select signal can be generated. During a write access
PCU write data is driven out with the address and block select signals in the first cycle of an access. The
addressed PEP block responds by asserting its ready signal indicating that it has registered the write data
and the access can complete. The write data bus is common to all PEP blocks.

A read access is initiated by driving the address and select signals during the first cycle of an access. The 
addressed PEP block responds by placing the read data on its bus and asserting its ready signal to indicate 
to the PCU that the read data is valid. Each block has a separate point-to-point data bus for read accesses to
avoid the need for a tri-stateable bus. 
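
The select/ready handshake can be modelled cycle by cycle. The sketch below is a behavioural illustration under simplifying assumptions: the class PepBlock, its wait_states parameter and the function pcu_access are hypothetical names, and each loop iteration stands for one pclk cycle with the select held high.

```python
class PepBlock:
    """Toy PEP block that answers after a fixed number of wait states."""

    def __init__(self, wait_states=0):
        self.registers = {}
        self.wait_states = wait_states

    def access(self, adr, rwn, wdata=None):
        """Yield (rdy, rdata) once per clock cycle while selected."""
        for _ in range(self.wait_states):
            yield 0, None                      # ready low: access still in progress
        if rwn == 0:                           # pcu_rwn = 0 means write
            self.registers[adr] = wdata
            yield 1, None                      # write data has been registered
        else:                                  # pcu_rwn = 1 means read
            yield 1, self.registers.get(adr, 0)


def pcu_access(block, adr, rwn, wdata=None):
    """Drive one access and return (read data or None, cycles taken)."""
    cycles = 0
    for rdy, rdata in block.access(adr, rwn, wdata):
        cycles += 1
        if rdy:                                # ready high marks the last cycle
            return rdata, cycles
```

A block with two wait states completes a write and the following read in three cycles each, with the read returning the value just written.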

21.8.3 PCU DRAM access requirements 

The PCU can execute register programming commands stored in DRAM. These commands can be
executed at the start of a print run to initialize all the registers of PEP. The PCU can also execute instructions
at the start of a page, and between bands. In the inter-band time, it is critical to have the PCU operate as 
fast as possible. Therefore in the inter-page and inter-band time the PCU needs to get low latency access to 
DRAM. 

A typical band change requires on the order of 4 commands to restart each of the CDU, LBD, and TE,
followed by a single command to terminate the DRAM command stream. This is on the order of 5 commands
per restart component.

The PCU does single 256 bit reads from DRAM. Each PCU command is 64 bits so each 256 bit DRAM 
read can contain 4 PCU commands. The requested command is read from DRAM together with the next 3 
contiguous 64-bits which are cached to avoid unnecessary DRAM reads. Writing zero to CmdSource 
causes the PCU to flush commands and terminate program access from DRAM for that command stream. 
The PCU requires a 256-bit buffer to hold the 4 PCU commands read by each 256-bit DRAM access. When the
buffer is empty the PCU can request DRAM access again. Adding a 256-bit double buffer would allow the 
next set of 4 commands to be fetched from DRAM while the current commands are being executed. 
1024 commands of 64 bits requires 8 kB of DRAM storage. 
Programs stored in DRAM are referred to as PCU Program Code.
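
The packing arithmetic above can be checked with a short sketch. The helper names are hypothetical, and the ordering of commands inside the 256-bit word (first command in bits 63:0) is an assumption based on the order in which diu_data delivers the four 64-bit pieces.

```python
COMMAND_BITS = 64
DRAM_WORD_BITS = 256


def split_commands(dram_word):
    """Split one 256-bit DRAM word into its four 64-bit PCU commands."""
    mask = (1 << COMMAND_BITS) - 1
    count = DRAM_WORD_BITS // COMMAND_BITS
    return [(dram_word >> (COMMAND_BITS * i)) & mask for i in range(count)]


def program_size_bytes(num_commands):
    """DRAM storage needed for a program of num_commands 64-bit commands."""
    return num_commands * COMMAND_BITS // 8
```

program_size_bytes(1024) gives 8192 bytes, matching the 8 kB figure quoted for 1024 commands.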

21.8.4 End of band unit 

The state machine is responsible for watching the various input xx_finishedband signals, setting the
FinishedSoFar flags, and outputting the finishedAllBand flag as specified by the BandSelectMask register.

Each cycle, the end of band unit performs the following tasks: 

finishedAllBand = (FinishedSoFar[0] == BandSelectMask[0]) AND
                  (FinishedSoFar[1] == BandSelectMask[1]) AND
                  (FinishedSoFar[2] == BandSelectMask[2]) AND
                  (BandSelectMask[0] OR BandSelectMask[1] OR BandSelectMask[2])
if (finishedAllBand == 1) then
    FinishedSoFar[0] = 0
    FinishedSoFar[1] = 0
    FinishedSoFar[2] = 0
else
    FinishedSoFar[0] = FinishedSoFar[0] OR (lbd_finishedband AND BandSelectMask[0])
    FinishedSoFar[1] = FinishedSoFar[1] OR (cdu_finishedband AND BandSelectMask[1])
    FinishedSoFar[2] = FinishedSoFar[2] OR (te_finishedband AND BandSelectMask[2])

Note that it is the responsibility of the microcode at the start of printing a page to ensure that all 3
FinishedSoFar bits are cleared. It is not necessary to clear them between bands since this happens
automatically.
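
The per-cycle behaviour of the end of band unit can be exercised with a small model. This is a sketch of the pseudocode above rather than the actual logic; the function name and the use of 3-element lists (index 0 = LBD, 1 = CDU, 2 = TE) are choices made for the example.

```python
def end_of_band_step(finished_so_far, band_select_mask,
                     lbd_finishedband, cdu_finishedband, te_finishedband):
    """Evaluate one clock cycle of the end of band unit.

    finished_so_far and band_select_mask are lists of three 0/1 values.
    Returns (finishedAllBand, next FinishedSoFar).
    """
    all_band = int(finished_so_far == band_select_mask
                   and any(band_select_mask))
    if all_band:
        # All watched units have finished: clear the flags automatically.
        return 1, [0, 0, 0]
    inputs = [lbd_finishedband, cdu_finishedband, te_finishedband]
    next_state = [flag | (signal & mask)
                  for flag, signal, mask
                  in zip(finished_so_far, inputs, band_select_mask)]
    return 0, next_state
```

With BandSelectMask set to watch the LBD and CDU only, the flags accumulate as each unit finishes, finishedAllBand is raised on the following cycle, and the flags return to zero without software intervention.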

21.8.5 Executing commands from DRAM

Registers in PEP can be programmed by means of simple 64-bit commands fetched from DRAM. The for- 
mat of the commands is given in Table 93. Register locations can have a data value of up to 32 bits. Com- 
mands are PEP register write commands only. 

Table 93. Register write commands in PEP

command          fields
Register write   data, zero, 32-bit word address


Due attention must be paid to the endianness of the processor. The LEON processor is a big-endian pro- 
cessor (bit 7 is the most significant bit). 



21.8.6 General Operation 



Upon a Reset condition, CmdSource is cleared (to 0), which means that all commands are initially sourced
only from the CPU bus interface. Registers can then be written to or read from one location at a time
via the CPU bus interface.

If CmdSource is 1, commands are sourced from the DRAM at CmdAdr or from the CPU bus. Writing an
address to CmdAdr automatically sets CmdSource to 1 and causes a command stream to be retrieved from
DRAM. The PCU will execute commands from the CPU or from the DRAM command stream, always giving
higher priority to the CPU.

Regardless of the state of CmdSource, the DRAM requestor examines the CmdPending bits to determine
if a new DRAM command stream is pending. If any of the CmdPending bits are set, then the appropriate
NextBandCmdAdr is copied to CmdAdr (causing CmdSource to get set to 1) and a new DRAM command
stream is retrieved from DRAM and executed by the PCU. Note that a new DRAM command stream only
gets retrieved when the current command stream is empty.

If there are no DRAM commands pending, and no CPU commands, the PCU defaults to an idle state.
When idle the PCU address bus defaults to the DebugSelect register value (bits 11 to 2 in particular) and
the default unit PCU data bus is reflected to the CPU data bus. The default unit is determined by the
DebugSelect register bits 15 to 12.

In conjunction with this, upon receipt of a finishedBand[n] signal, NextBandCmdEnable[n] is copied to
CmdPending[n] and NextBandCmdEnable[n] is cleared. Note, each of the LBD, CDU, and TE (where
present) may be re-programmed individually between bands by appropriately setting NextBandCmdAdr[2-0]
respectively. However, execution of inter-band commands may be postponed until all blocks specified
in the BandSelectMask register have pulsed their finishedband signal. This may be accomplished by only
setting NextBandCmdAdr[3] (indirectly causing NextBandCmdEnable[3] to be set) in which case it is the
finishedAllBand signal which causes NextBandCmdEnable[3] to be copied to CmdPending[3].

To conveniently update multiple registers, for example at the start of printing a page, a series of Write
Register commands can be stored in DRAM. When the start address of the first Write Register command is
written to the CmdAdr register (via the CPU), the CmdSource register is automatically set to 1 to actually
start the execution at CmdAdr.

The final instruction in the command block stored in DRAM must be a register write of 0 to CmdSource so 
that no more commands are read from DRAM. Subsequent commands will come from pending programs 
or can be sent via the CPU bus interface. 
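
As an illustration of how such a command block might be assembled, here is a minimal sketch in Python. It assumes the layout implied by the DRAM command decode (data in bits 63:32, a word-aligned register address in the low 16 bits, all other bits zero) and big-endian byte order in DRAM. The CmdSource address used here is hypothetical, since this section does not give it, and the helper names are invented for the example.

```python
CMD_SOURCE_ADDR = 0x0008  # hypothetical address of CmdSource; not given in this section


def encode_write(address, data):
    """Pack one PEP register write into a 64-bit command word (assumed layout)."""
    if address & 0x3 or not 0 <= address <= 0xFFFF:
        raise ValueError("address must be 32-bit word aligned and fit in 16 bits")
    if not 0 <= data <= 0xFFFFFFFF:
        raise ValueError("data must fit in 32 bits")
    return (data << 32) | address


def build_command_block(writes, cmd_source_addr=CMD_SOURCE_ADDR):
    """Encode (address, data) pairs and append the terminating write of 0 to CmdSource."""
    cmds = [encode_write(address, data) for address, data in writes]
    cmds.append(encode_write(cmd_source_addr, 0))
    return cmds


def to_bytes(cmds):
    """Serialise the commands as big-endian 64-bit words (byte order is an assumption)."""
    return b"".join(cmd.to_bytes(8, "big") for cmd in cmds)
```

The terminating command is appended automatically so that a block built this way always ends with the register write of 0 to CmdSource required above.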
21.8.6.1 Debug Mode



Debug mode is implemented by reusing the normal CPU and DRAM access decode logic. When in the
Arbitrate state (see state machine A below), the PEP address bus is defaulted to the value in the
DebugSelect register. The top bits of the DebugSelect register are used to decode a select to a PEP unit and the
remaining bits are reflected on the PEP address bus. The selected unit's read data bus is reflected on the
pcu_cpu_data bus to the RDU in the CPU. The pcu_cpu_debug_valid signal indicates to the RDU that the
data on the pcu_cpu_data bus is valid debug data. The pcu_cpu_debug_valid signal is a repeated version of the
selected unit's ready signal <unit>_pcu_rdy.

Normal CPU and DRAM command access will require the PEP bus, and as such will cause the debug data
to be invalid during the access. This is indicated to the RDU by setting pcu_cpu_debug_valid to zero.

The decode logic is : 

// Default Debug decode
pcu_<unit>_sel      = decode(DebugSelect[15:12])
pcu_adr[11:2]       = DebugSelect[11:2]
pcu_cpu_data        = <unit>_pcu_datain[31:0]
pcu_cpu_debug_valid = <unit>_pcu_rdy AND state == Arbitrate
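
The bit slicing in the decode can be checked with a short software model. This is a minimal sketch in Python; the function name and the use of a string for the state are assumptions for illustration, and the unit select is returned as a raw 4-bit number rather than a one-hot select.

```python
def debug_decode(debug_select, unit_rdy, state):
    """Return (unit select, PEP address bits 11:2, debug valid) for a DebugSelect value."""
    unit_sel = (debug_select >> 12) & 0xF    # DebugSelect[15:12]
    pcu_adr = (debug_select >> 2) & 0x3FF    # DebugSelect[11:2]
    debug_valid = 1 if (unit_rdy and state == "Arbitrate") else 0
    return unit_sel, pcu_adr, debug_valid
```

Debug data is only flagged valid while state machine A is in Arbitrate, matching the statement that CPU and DRAM accesses invalidate it.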



21.8.7 State Machines

DRAM command fetching and general command execution is accomplished using two state machines.
State machine A evaluates whether a CPU or DRAM command is being executed, and proceeds to execute
the command(s). Since the CPU has priority over the DRAM it is permitted to interrupt the execution of a
stream of DRAM commands.

Machine B decides which address should be used for DRAM access, fetches commands from DRAM and
fills a command fifo which A executes. The reason for separating the two functions is to facilitate the
execution of CPU or Debug commands while state machine B is performing DRAM reads and filling the
command fifo. In the case where state machine A is ready to execute commands (in its Arbitrate state) and
it sees both a full DRAM command fifo and an active cpu_pcu_sel then the DRAM commands are
executed.
21.8.7.1 State Machine A: Arbitration and execution of commands

[Figure: State Machine A, with states Reset, Arbitrate, CpuAccess, DRAMAccess and AdrError]

Figure 97. Command Arbitration and execution

The state-machine enters the Reset state when there is an active strobe on either the reset pin, prst_n, or the
PCU's soft-reset register. All registers in the PCU are zeroed, unless otherwise specified, on the next rising
clock edge. The PCU self-deasserts the soft reset in the pclk cycle after it has been asserted.

The state changes from Reset to Arbitrate when prst_n = 1 and pcu_softreset_n = 1.

The state-machine waits in the Arbitrate state until it detects a request for CPU access to the PEP units
(cpu_pcu_sel = 1 and cpu_acode = 11) or a request to execute DRAM commands (CmdSource = 1) when
DRAM commands are available (CmdFifoFull = 1). Note that if cpu_pcu_sel = 1 and cpu_acode != 11, the
CPU is attempting an illegal access. The PCU ignores this command and strobes cpu_pcu_berr for one
cycle.

While in the Arbitrate state the machine assigns the DebugSelect register to the PCU unit decode logic and
the remaining bits to the PEP address bus. When in this state the debug data returned from the selected
PEP unit is reflected on the CPU bus (pcu_cpu_data bus) and pcu_cpu_debug_valid = 1.

If a CPU access request is detected (cpu_pcu_sel = 1 and cpu_acode = 11) then the machine proceeds
to the CpuAccess state. In the CpuAccess state the cpu address is decoded and used to determine the PEP
unit to select. The remaining address bits are passed through to the PEP address bus. The machine remains
in the CpuAccess state until a valid ready from the selected PEP unit is received. When received the
machine returns to the Arbitrate state, and the ready signal to the CPU is pulsed.
// decode the logic
pcu_<unit>_sel = decode(cpu_adr[15:12])
pcu_adr[11:2]  = cpu_adr[11:2]

If when decoding the cpu_adr bus, the address selects a reserved address, the state machine proceeds to 
the AdrError state, and then back to the Arbitrate state. An address error interrupt will be generated. 

If the state machine detects a request to execute DRAM commands (CmdSource = 1), it will wait in the
Arbitrate state until commands have been loaded into the command FIFO from DRAM (all controlled by
state machine B). When the DRAM commands are available (cmd_fifo_full = 1) the state machine will
proceed to the DRAMAccess state.

When in the DRAMAccess state the commands are executed from the cmd_fifo. A command in the
cmd_fifo consists of 64-bits (of which the FIFO holds 4). The decoding of the 64-bits to commands is
given in Table 93. For each command the decode is
// DRAM command decode
pcu_<unit>_sel = decode(cmd_fifo[cmd_count][15:12])
pcu_adr[11:2]  = cmd_fifo[cmd_count][11:2]
pcu_dataout    = cmd_fifo[cmd_count][63:32]

When the selected PEP unit returns a ready signal (<unit>_pcu_rdy = 1) indicating the command has
completed, the state machine will return to the Arbitrate state. If more commands exist (cmd_count != 0)
the transition will decrement the command count.

When in the DRAMAccess state, if when decoding the DRAM command address bus
(cmd_fifo[cmd_count][15:12]), the address selects a reserved address, the state machine proceeds to the
AdrError state, and then back to the Arbitrate state. An address error interrupt will be generated and the
DRAM command FIFOs will be cleared.

A CPU access can pre-empt any pending DRAM commands. After each command is completed the state
machine returns to the Arbitrate state. If a CPU and DRAM command are pending the CPU command
always takes priority. If a CPU or DRAM command sets the CmdSource to 0, all subsequent DRAM
commands in the command FIFO are cleared. If the CPU sets the CmdSource to 0 the CmdPending and
NextBandCmdEnable work registers are also cleared.
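
The transitions described for state machine A can be summarised as a next-state function. The following is a minimal sketch in Python under stated assumptions: state names are represented as strings, cpu_acode = 11 is read as the 2-bit binary value, and CPU requests are given priority over DRAM commands in the Arbitrate state as this paragraph states. It models only the state transitions, not the bus signals.

```python
def next_state_a(state, cpu_pcu_sel=0, cpu_acode=0, cmd_source=0,
                 cmd_fifo_full=0, unit_rdy=0, adr_reserved=0):
    """Return the next state of state machine A for one clock cycle."""
    if state == "Reset":
        return "Arbitrate"  # taken once both reset conditions are removed
    if state == "Arbitrate":
        if cpu_pcu_sel and cpu_acode == 0b11:
            return "CpuAccess"       # CPU always takes priority
        if cmd_source and cmd_fifo_full:
            return "DRAMAccess"
        return "Arbitrate"
    if state in ("CpuAccess", "DRAMAccess"):
        if adr_reserved:
            return "AdrError"        # reserved address selected
        return "Arbitrate" if unit_rdy else state
    if state == "AdrError":
        return "Arbitrate"
    raise ValueError("unknown state: " + state)
```

Because every access returns to Arbitrate after a single command, a CPU request can be serviced between any two DRAM commands.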
21.8.7.2 State Machine B: Fetching DRAM commands



[Figure: State Machine B, with states Reset, Wait, FillFifo, Data1, Data2 and Data3]

Figure 98. DRAM command access state machine



A system reset (prst_n = 0) or a software reset (pcu_softreset_n = 0) will cause the state machine to reset
to the Reset state. The state machine remains in the Reset state until both reset conditions are removed. When
removed the machine proceeds to the Wait state.

The state machine waits in the Wait state until it determines that commands are needed from DRAM. Two
possible conditions exist that require DRAM access. Either the PCU is processing commands which must
be fetched from DRAM (cmd_source = 1) and the command FIFO is empty (cmd_fifo_full = 0), or the
command FIFO is empty and there are some commands pending (cmd_pending != 0). In either of these
conditions the machine proceeds to the FillFifo state and issues a read request to DRAM
(pcu_diu_rreq = 1). It calculates the address to read from dependent on the transition condition. In the
command pending transition condition, the highest priority NextBandCmdAdr that is pending is used for
the read address (pcu_diu_radr) and is also copied to the CmdAdr register. In the normal PCU processing
transition the pcu_diu_radr is the CmdAdr register.
In the FillFifo state the machine waits for the DRAM to respond to the read request and transfer data words.
On receipt of the first word of data (diu_pcu_rvalid = 1), the machine stores the 64-bit data word in the
command FIFO (cmd_fifo[0]) and transitions to the Data1, Data2 and Data3 states, each time waiting for a
diu_pcu_rvalid = 1 and storing the transferred data word to cmd_fifo[1], cmd_fifo[2] and cmd_fifo[3]
respectively.

When the transfer is complete the machine returns to the Wait state, setting the cmd_count to 3 and
cmd_fifo_full = 1.
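
The fetch decision and FIFO fill performed by state machine B can be sketched as follows. This is a minimal model in Python, not the hardware; the function names are invented, cmd_pending is treated as a 4-bit mask, and the priority order among pending NextBandCmdAdr entries (lowest index first) is an assumption, since the text only says that the highest priority pending address is used.

```python
def select_fetch(cmd_source, cmd_fifo_full, cmd_pending, cmd_adr, next_band_cmd_adr):
    """Decide whether to leave Wait and which DRAM address to read.

    Returns (fetch, read_address, new_cmd_adr).
    """
    if cmd_fifo_full:
        return False, None, cmd_adr          # FIFO still holds commands
    if cmd_source == 1:
        return True, cmd_adr, cmd_adr        # normal PCU processing transition
    for n in range(4):                        # assumed priority: lowest index first
        if (cmd_pending >> n) & 1:
            adr = next_band_cmd_adr[n]
            return True, adr, adr             # pending address is also copied to CmdAdr
    return False, None, cmd_adr


def fill_fifo(words):
    """Store the four 64-bit words of one 256-bit access; returns (fifo, cmd_count, full)."""
    if len(words) != 4:
        raise ValueError("a 256-bit DRAM access delivers exactly four 64-bit words")
    return list(words), 3, 1
```

After fill_fifo the count is 3 and the full flag is 1, which is the condition state machine A waits for before entering DRAMAccess.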



21.8.7.3 PCU_ICU_Address_Invalid interrupt

When the PCU is executing commands, addresses decoded from commands which are not PCU mapped
addresses (4-bits only) will result in the current command being ignored and the pcu_invalid_address
interrupt signal being strobed. (If this command is from DRAM, all remaining commands already retrieved
from DRAM are flushed from the CmdFifo, and CmdPending, NextBandCmdEnable and CmdSource are
cleared to zero.) These registers are unaffected if the command is from the CPU. The CPU can then
interrogate the PCU to find the source of the illegal command via the InvalidAddress register.
22 Contone Decoder Unit (CDU)



22.1 Overview



The Contone Decoder Unit (CDU) is responsible for performing the optional decompression of the
contone data layer.

The input to the CDU is up to 4 planes of compressed contone data in JPEG interleaved format. This will
typically be 3 planes, representing a CMY contone image, or 4 planes representing a CMYK contone
image. The CDU must support a page of A4 length (11.7 inches) and Letter width (8.5 inches) at a
resolution of 267 ppi in 4 colors and a print speed of 1 side per 2 seconds.

The CDU and the other page expansion units support the notion of page banding. A compressed page is 
divided into one or more bands, with a number of bands stored in memory. As a band of the page is con- 
sumed for printing a new band can be downloaded. The new band may be for the current page or the next 
page. Band-finish interrupts have been provided to notify the CPU of free buffer space. 

The compressed contone data is read from the on-chip DRAM. The output of the CDU is the
decompressed contone data, separated into planes. The decompressed contone image is written to a circular
buffer in DRAM with an expected minimum size of 12 lines and a configurable maximum. The
decompressed contone image is subsequently read a line at a time by the CFU, optionally color converted, scaled
up to 1600 ppi and then passed on to the HCU for the next stage in the printing pipeline. The CDU also
outputs a cdu_finishedband control flag indicating that the CDU has finished reading a band of
compressed contone data in DRAM and that area of DRAM is now free. This flag is used by the PCU and is
available as an interrupt to the CPU.



22.2 Storage requirements for decompressed contone data in DRAM 



A single SoPEC must support a page of A4 length (11.7 inches) and Letter width (8.5 inches) at a
resolution of 267 ppi in 4 colors and a print speed of 1 side per 2 seconds. The printheads specified in the
Bilithic Printhead Specification [2] have 13824 nozzles per color to provide full bleed printing for A4 and
Letter. At 267 ppi, there are 2304 contone pixels (see note 1) per line represented by 288 JPEG blocks per color.
However each of these blocks actually stores data for 8 lines, since a single JPEG block is 8 x 8 pixels. The
CDU produces contone data for 8 lines in parallel, while the HCU processes data linearly across a line on
a line by line basis. The contone data is decoded only once and then buffered in DRAM. This means we
require two sets of 8 buffer-lines: one set of 8 buffer lines is being consumed by the CFU while the other
set of 8 buffer lines is being generated by the CDU.

The buffer requirement can be reduced by using a 1.5 buffering scheme, where the CDU fills 8 lines while
the CFU consumes 4 lines. The buffer space required is a minimum of 12 line stores per color, for a total
space of 108 KBytes (see note 2). A circular buffer scheme is employed whereby the CDU may only begin to write a
line of JPEG blocks (equals 8 lines of contone data) when there are 8 lines free in the buffer. Once the full
8 lines have been written by the CDU, the CFU may begin to read them on a line by line basis.
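
The figures quoted above can be reproduced with a short calculation. This is a minimal sketch in Python, assuming a scale factor of 6 between the 1600 dpi printhead and 267 ppi contone data, and 1 byte per color per pixel; the variable names are illustrative only.

```python
NOZZLES_PER_COLOR = 13824   # full bleed A4/Letter printhead
SCALE_FACTOR = 6            # assumed: 1600 dpi output / 267 ppi contone
JPEG_BLOCK_SIZE = 8         # a JPEG block is 8 x 8 pixels
COLORS = 4
BUFFER_LINES = 12           # 1.5 buffering: 8 lines being filled + 4 being consumed

pixels_per_line = NOZZLES_PER_COLOR // SCALE_FACTOR           # 2304 contone pixels
blocks_per_line = pixels_per_line // JPEG_BLOCK_SIZE          # 288 JPEG blocks per color
buffer_bytes = BUFFER_LINES * COLORS * pixels_per_line        # 1 byte per color per pixel
buffer_kbytes = buffer_bytes / 1024                           # 108 KBytes
```

Changing BUFFER_LINES to 16 gives the double-buffer size of 144 KBytes under the same assumptions.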

This reduction in buffering comes with the cost of an increased peak bandwidth requirement for the CDU
write access to DRAM. The CDU must be able to write the decompressed contone at twice the rate at
which the CFU reads the data. To allow for trade-offs to be made between peak bandwidth and amount of
storage, the size of the circular buffer is configurable. For example, if the circular buffer is configured to be
16 lines it behaves like a double-buffer scheme where the peak bandwidth requirements of the CDU and



1. Pixels may be 8, 16, 24 or 32 bits depending on the number of color planes (8-bits per color)

2. 12 lines x 4 colors x 2304 bytes (assumes 267 ppi, 4 color, full bleed A4/Letter)
decompressed contone data (buffer)

a. Required for CFU to convert to final output at 1600 dpi.

b. Bi-lithic printhead has 13824 nozzles per color providing full bleed printing for A4/Letter.

c. Bi-lithic printhead has 19488 nozzles per color providing full bleed printing for A3.

d. 12 lines x 4 colors x 2304 bytes.

22.3 Decompression performance requirements 

...system clock cycle to achieve a print speed of 1 side per 2 seconds for full bleed A4/Letter printing. The CFU
replicates pixels a scale factor (SF) number of times in both the horizontal and vertical directions to
convert the final output to 1600 ppi. Thus the CFU consumes a 4 color pixel (32 bits) every SF x SF cycles.
The 1.5 buffering scheme described in section 22.2 means that the CDU must write the data at
twice this rate. With support for 4 colors at 267 ppi, the decompression output bandwidth requirement is
1.78 bits/cycle (see note 1).

The JPEG decoder is fed directly from the main memory via the DRAM interface. The amount of
compression determines the input bandwidth requirements for the CDU. As the level of compression increases,
the bandwidth decreases, but the quality of the final output image can also decrease.

compression ratio for contone dL is ^peSed to be7o 7^ k "7"!^ ^'^^^^'^ ^'^ 

allows for a local minimum ^n^pr^il^^Tof^^o'c^'^^^^^^ 

stir^e?rrni~^^°^'*^^^'*^-^^^ 

^.urr^ad;uteU-t^^^^^^^^ 



1. 2 x ((4 colors x 8 bits) / (6 x 6 cycles)) = 1.78 bits/cycle

2. 2 x ((4 colors x 8 bits) / (4 x 4 cycles)) = 4 bits/cycle
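
The two footnote calculations follow the same formula, which can be written as a small function. This is a minimal sketch in Python; the function and parameter names are illustrative, and the factor of 2 reflects the 1.5 buffering scheme requiring the CDU to write at twice the rate at which pixels are consumed.

```python
def cdu_output_bandwidth(colors, bits_per_color, scale_factor):
    """Decompression output bandwidth in bits per cycle.

    One pixel of (colors x bits_per_color) bits is consumed every
    scale_factor x scale_factor cycles; the CDU must write at twice that rate.
    """
    return 2 * (colors * bits_per_color) / (scale_factor * scale_factor)
```

For 4 colors at 8 bits each, a scale factor of 6 gives approximately 1.78 bits/cycle and a scale factor of 4 gives 4 bits/cycle.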
[Table: ... requirements for full bleed A4/Letter printing at 1 side per 2 seconds. Surviving cell values, in source order: 267, 6, 1.78; 4, 4; 600, 2.]

b. Scale factor 2 requires at least a 16 line buffer.



22.4 Data flow 



The line buffers are subsequently read by the CFU.

[Figure: DRAM holding the compressed contone planes and the decompressed contone line store buffer (minimum 12 lines)]

Figure 99. Outline of contone data flow with respect to CDU

bTe^^^t^i^rr^^Hr^^: r-' r 

and K. direcUy represented by CMY^ ^ 'SJ^ ^1 ^ <=' M. Y. 

muIti-SoPEC printing with exact colore. " '^P'*^'''^* 8°"' 8«en etc. 

n"^:;Ka:^!S^t™-~-^^^ 

contain luminance infonnation and srwm:i?n^H^ considered to be luminance, but C. M. and Y each 
We therefore provide the me^" bTwh^rci^ L compressed with appropriate luminance tables 
conversion. When being JPE^comJreief Si??s t^S ^""^^^ ^ ^o^o^ 
finally JPEG compressed. At decon^Sn t5f YJSi'^?r"''L'° '^^'^ ""'^ 
contone store by the CDU. This is r^lTihTc^V^l^ul t^^'^ and wntt,^ to the decompressed 
verted to RGB, and finaUy back to CMY ^'^'^ "° optionally color con- 



a« nonnalized to occupy all 256 levels of an 8-b« biSy en^odS ^' ^ '^'^ ^ 

The CFU provides the translation to either RGB or CMY ROR ,c inri..w-^ • 
22.5 Implementation

A block diagram of the CDU is shown in Figure 100. 



[Figure: the Contone Decoder Unit, containing the read control unit, compressed contone FIFO, JPEG decoder, write control unit, half-block buffer interface, contone line store interface and configuration registers, connected to the DRAM Interface Unit, the PEP Controller Unit, the Contone FIFO Unit, and Clock, Power and Reset]

Figure 100. Block diagram of CDU



A!l output signals from the CDU (cdu cfu wradvfiUn^ ^ , i. . 
Signals to the DRJ) n.ust alwlys f ^J^^r '^t^^^^ 

c^u.c>.H^.v.//;..c^.^^^^ not currently decoding. 

The read control unit is responsible for keeping the JPEG decoder's input FIFO full by reading the
compressed contone bytestream from external DRAM via the DIU, and for producing the cdu_finishedband signal.
The write control unit accepts the output from the JPEG decoder a half JPEG block at a time,
writes it into a double-buffer, and writes the double-buffered decompressed half blocks to DRAM via the
DIU, interacting with the CFU in order to share DRAM buffers.



22.5.1 Definitions of I/O 



port list and description



Clocks and reset 



jctk 



JcJk_enaWe 



|rsl_n 



pcu_cdu_sel 



Pcu-adrr7:2j 



pcu_dataout[3 1 ;0 j 



cdLr,j>cu,data(31:0] 
_PIU read Interface 



diu.cdu^iack 



diu^dataf63:0 ] 
DIU write Interface 



cdu_diu_wfeq 



<liu_cdu_wack 



Doc: SoPEChardware^design 
Version: 2.3 



J" System dock. 



In 



Out 



In 



fn 



Gated version of system dock used to dock the jppg ho^-. 



32 



In 



fn 



tn 



Out 



32 



Out 



Block select from the PCU. When pcu_cdu_sel is high both
pcu_adr and pcu_dataout are valid.



Common read/not-write signal from the PCU. 



PCU address bus. Only 6 bits are required to decode the
address space for this block.



Shared write data bus from the PCU. 



^elastc^dJS^!^ 

m/^^^ h J h ^"^^^^ * write.cyde this means 
^itT ^^^'^ registered by the block and for a read 
cyde this means the data on cdu^ data is v^nn 



Read data bus to the PCU. 



Out 



fn 



In 



request has been accepted and the new read address can be 
placed on the address bus. afu_diu_,adr. "■™'**'<*"«« 



CPU read addres s. 17 bits wide ( 256-bit alton«H 

Dn**.^ ^-A^ . .. . ' 1_ 



^wl^^."^^'^.'!"^'^^ '^'^^^^ read data Is " 

now on the read data bus, dfu_data. 



Out 



In 
CDU port list and description



cdu_diu_wad42l :3J 




odu_cftj_wradv8«ne 



TE and LBD Interface 



Out 



Read Hne pulse, active high. Indicates that the cr i h«« i;«j»k J7 
buffer In DRAM and that line of ftp buffer io now free 



cdu_start_of_bandstore[21 :5j 



cdu_end_oLbandstore[2l :5] 



ICQ Interfac e 
cdu_finishedt)and 



17 



17 



Out 



Out 



Points to the 256-bit word that defines the start of the memory
area allocated for page bands.



Points to the 256-bit word that defines the last address of the
memory area allocated for page bands.



odujcujpegerror 



L 



Out 



orS . ""'""^ processing a band of com- 

pressed contone data in DRAM and that area of ORAM Isnow 
fr^This sionai goes to t«th the internip. co„,?,!„erZ' 



- 



i/r f !?^?s^ ^decompression has stopped A 

reset of the CPU must be peribmied to clear this inS ^ 
22.5.2 Configuration registers

Note that since addresses in SoPEC are byte aligned and the PCU only supports 32-bit register reads
and writes, the lower 2 bits of the PCU address bus are not required to decode the address space for the CDU.




The CDU contains the following additional registers:

StartOfBandStore (17 bits, reset value 0x0_0000): Points to the 256-bit word that defines the start of the
memory area allocated for page bands. Circular address generation wraps to this start address.

EndOfBandStore (address 0x84, 17 bits, reset value 0x1_FFFF): Points to the 256-bit word that defines the last
address of the memory area allocated for page bands. If the current read address is from this address, then
instead of adding 1 to the current address, the current address will be loaded from the StartOfBandStore
register.



Table 98. CDU registers 




0x04 



^etup regi sters 
0x10 



Go 



A wme to this register causes a reset of the CDU 
?^fti1r!rf f operations within the ' 

CS61 50. All confiflufation data previously loaded into 
tf>e core except tor the tables is deleted. 



Writing 1 to this register starts the CDU. Writing 0 to
this register halts the CDU.
When Go is deasserted the state-machines go to
their idle states but all counters and configuration
registers keep their values.

When Go is asserted all counters are reset but
configuration registers keep their values (i.e. they don't
get reset). NextBandEnable is cleared when Go is
asserted.

The CFU must be started before the CDU is started.
This register can be read to determine if the CDU is
running (1 - running, 0 - stopped).



MaxPlane



Defines the number of contone planes - 1.
This will be 0 for K (greyscale printing), 2
for CMY, and 3 for CMYK.



0x14 



0x1 C 



MaxBlock 



13 



15 



BuffEndAcfr 



0x24 



0x30 



0x34 



0x36 



0x000 



0x0000 



U. 6x8 bytes) in a line - 1. H"'vajenis. 



Points to me start of the decompressed contone 
Sind^^^' • aligned toa haff JPEG bto^ 
A half JPEG Week consists of 4 words of 256-bfts 
aTS "^"'^"^^ '^^^^ ^ "^'^^^ ^^^^ 



Points to the start of the fast half JPEG block at the 
^QA^ « deoonnpfessed contone circular buffer in 
^^.^^InJSI'?? ^ * "«* boundary. 

A half JPEG block consists of 4 words of 256-blts 

rjPEGSor'^'^''^''^'''^*"'^^^^^^ 



Bypassjpg 



NextBandCurrSourceAdr



Defines size of buffer in DRAJW m terms of the 

JJirJ^SL*"^ clecompressed contone Unes. The size of 
msm size of 8 hnes. . 



0x0 



NextBandEnd- 
SourceAdr 



19 



0x3C 



NextBandValidBytesLastFetch



NextBandEnable 



Read-only registers 



Determines whether or not the JPEG decoder will be
bypassed (and hence pixels are copied directly from
input to output).
0 - don't bypass, 1 - bypass
Should not be changed between bands.



The 256-bit aligned word address containing the start
of the next band of compressed contone data in
DRAM.
This value is copied to CurrSourceAdr when both
DoneBand is 1 and NextBandEnable is 1, or when
Go transitions from 0 to 1.



0x0,0000 



The 64-bit aligned word address containing the last
bytes of the next band of compressed contone data in
DRAM.
This value is copied to EndSourceAdr when
both DoneBand is 1 and NextBandEnable is 1, or
when Go transitions from 0 to 1.



Mask containing a 1 in each bit position that represents
a valid byte in the last 64-bit fetch of the next
band of compressed contone data from DRAM.
This value is copied to ValidBytesLastFetch when
both DoneBand is 1 and NextBandEnable is 1, or
when Go transitions from 0 to 1.



When NextBandEnable is 1 and DoneBand is 1, then
when cdu_finishedband is set at the end of a band:
- NextBandCurrSourceAdr is copied to CurrSourceAdr,
- NextBandEndSourceAdr is copied to EndSourceAdr,
- NextBandValidBytesLastFetch is copied to ValidBytesLastFetch,
- DoneBand is cleared,
- NextBandEnable is cleared.

NextBandEnable is cleared when Go is asserted.
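
The shadow-register handover described for NextBandEnable can be modelled in a few lines. This is a minimal sketch in Python; representing the registers as a dictionary keyed by register name, and the function name itself, are assumptions made for illustration.

```python
def end_of_band_update(regs):
    """Apply the next-band handover if it is armed; returns True if it took place.

    regs maps register names to integer values and is updated in place.
    """
    if regs["NextBandEnable"] == 1 and regs["DoneBand"] == 1:
        regs["CurrSourceAdr"] = regs["NextBandCurrSourceAdr"]
        regs["EndSourceAdr"] = regs["NextBandEndSourceAdr"]
        regs["ValidBytesLastFetch"] = regs["NextBandValidBytesLastFetch"]
        regs["DoneBand"] = 0
        regs["NextBandEnable"] = 0
        return True
    return False
```

If NextBandEnable is 0 at the end of the band, nothing is copied and DoneBand stays set until software arms the next band.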



DoneBand 



0x0 



CurrSourceAdr



EndSourceAdr 



ValidBytesLastFetch



JPEG decoder core setup leglsters 




0x0^0000 



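Specifies whether or not the current band has finished
loading into the local FIFO. It is cleared to 0
when Go transitions from 0 to 1.

When the last of the compressed contone data for the
band is loaded into the local FIFO, the
DoneBand flag is set.

If NextBandEnable is 1 at this time then the registers
are updated with the values for the next band and
DoneBand is cleared. Processing of the next band
starts immediately.

If NextBandEnable is 0 then the remainder of the
CDU will continue to run, processing the data already
loaded, while the read control unit waits for
NextBandEnable to be set before it restarts.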



OxO_0000 



The current 256-bit aligned word address within the
current band of compressed contone data in DRAM.



^L^tll ^'^'^ containing the lasi 

Sr.n^'SS:'""* «--of compressed «.nfa„e 



Mask containing a 1 in each bH position that reore." 
fi^«ci =^ . are vaw. then the lower 3 bits of Valid- 



JpgDecMask 



JpgDecTType



JpgDecTestEn



JpgDecPType



As segments are decoded they can also be output on 
the DB<Upg (JpgOecHds) port with the user sheeting 

J^stTrtrfS^^^ ^ -^'^^ ^^iP^^- 

4 SOF+SOS+DNL

3 COM+APP

2 DRI

1 DQT

0 DHT

lTio!'!^J^At'''^l^^^''^^^'' '^^^ then 



:!g£g^ecoaer core readK>nly status rea^at^ri 



Test type sefector ~ — " 

? ' SS^^^®*^!:"*^ dfepJayed on JpgDecTdata 
1 > QDCT coefficient displayed on JpgDecTdata 



Signal which causes the memories to be bypassed
for test purposes.



!n^"/i''£^'^^"^ parameters to be placed on port" 
Jp9DecP\/^lue (See Table 99V ^ 



JpgDecHdr



Selected header segments from the JPEG stream " 




0x64 



0x68 



0x6C 



JpgDecTData 



JpgDecPValue



JpgDecStatus



13 



16 



22 



0x0000 



0x0000 



OxOO.CXXX) 



pot byte of the first 8x8 block of the test data 

.'Jf "^^"^ ''^'^^^s the first out. 

put byte of each 8x8 block of test data 
11-0- 1 1-bit output test data port . displays OCT 



Decoding parameter bus which enables various
parameters used by the core to be read. The data
available on the PValue port is for information only,
and does not contain control signals for the decoder



Bit 21 - jpg_core_stall (if set, indicates that the JPEG
core is stalled because the JPEG
half-block double-buffers of the CDU are full)

Bit 20 - pix_out_valid (this signal is an output from
the JPEG decoder core and is asserted when a pixel
is being output)

Bits 19-16 - fifo_contents (FIFO at input of JPEG
decoder core)

Bits 15-0 - status of the CS6150 (see Table 100 for description of bits).



22.5.3 Typical operation

The CDU should only be started after the CFU has been started.

/Lines. Users then set the CDU's Go hi^'^^r^ ■ ^V J"^'^' ^"MndBlockAdr and NumBuf- 
for the band has finishedS^g rS i^ ^SSS^'^^^ When the compressed contone da4 
indicating that the memory assoc^^i* tjf&^rbi^^^^ ""T^^ '""^ ""^ '° CPU 
band of contone data. ^ """^ P'o««ing can now start on the next 

for restarting the CDU bet«4en banS ^ *° NextBandEnable. There are 4 mechanisms 

d.This is a combination of 6 and c above TIir x>ni t *i. . . 




registers and sets the NextBandEnable bit before the «,h „f .k 

current band the CDU sets Z)o«eW smd mli cl^ , .^.'T"' """^ '^"'^ 
already 1 . the CDU starts pr^Ti^J^Z.f^J'"'^'^'^ ^ N^tBandEnable is 
cdujinishedband tnsRj^^U^f^^ "inmediately. Simultaneously, 
rcs^ «y the ^i»"SfeTcu 1"^^^^^^^^^^^^ have 
gnunthcCDU'snextbandshadowregistersandseJSS^S^^^^^^^ 
If an error occurs in the JPEG stream, the JPEG decoder will suspend its operation, an error bit will be set
in the JpgDecStatus register and the core will ignore any input data and await a reset before starting
decoding again. An interrupt is sent to the CPU by asserting cdu_icu_jpegerror. The CDU should then be
reset by means of a write to its Reset register.

22.5.4 Read control unit 

by means of the state machine described in nj« fo? implemented 
All counters and flags should be cleared after reset. When Go transitions from 0 to 1 all counters and flags
should take their initial value. While the Go bit is set, the state machine relies on the DoneBand bit to tell
it whether to attempt to read a band of compressed contone data. When DoneBand is set, the state machine
does nothing. When DoneBand is clear, the state machine continues to load data into the JPEG input FIFO
up to 256-bits at a time while there is space available in the FIFO. Note that the state machine has no
knowledge about numbers of blocks or numbers of color planes; it merely keeps the JPEG input FIFO full
by consecutive reads from DRAM.

..leastatthepeakDRAMreadbandJcS^rO^^^^^^^^ 

Iremid wSerrS^r.? i^r^^Af ^^^.^^^^ ^ « ^^^^it read access. It is 
diu_cdu_rvalid being as^S" cut ^oJ^!^, ^ i°di<=ated by 

end_of_bandstorei ' '^"^«^'«*-'^ « compared to both end_source_adr and 

• ™7l^o":Sut^^rof^^^^^^^^ control sign^ sent to the 

« se^^^e remaining .4-bit va.es in th^ hJrmt^jl^S^rK^^ 

■ I'^J^r^LTl^^J'^-^^^^^^ cioes not e,ual end_.ou..e_adr, then 

whether c«,r^o«nJWr ^JsTeSLSl *L curr_source_adr + 1. depending on 

FIFO is 0. - ^'^'^'*"''-°>^**''^''''*-T^'^«'^-«/-W control 

a/fr^ot»ce_a<&- is output to the DIU as c</w_<//u_^a^f>: 

A count is kept of the number of 64-bit values in the FTFr^ wt,- j 

0. data is written to the FIFO by assert^ W^^d ? « I and ignore_data is 

incremented. asscrang /•i/oW'r. aaAfifo_contents[3:0] aaAfifo_wr_adr[2:0] are both 

rnf^rtSS'iir :re^^^^ ^^^-^ is data available in 

data from the FIFO. Note it is also poSblf to b^^^^^^^ - -^''^ »«> -=cive 

'Ster to 1 . In this case data is sent Lctly fromElTO to t^e lT„^^^ \TTV^^ 

decoder is riot stalled {jpg core stall equal oTand rj?„ !^ / double-buffer. While the JPEG 

a byte of data is consSfd by tife JpS deL« cSf ^/t^V^^^^^ andy>^_/._.rr6 are both 1. 

ne.byte.^ereadaddressisVea..Jrn^T^^1^^^^^ 
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S3 



Figure 101. State machine to read compressed contone data

22.5.5 Compressed contone FIFO

wi.h Ss^S ciSn^E Stid S'' f?' -'''^ "■^'^ accommodate two 256-bit accesses) 
ten to the FIF?Zm^S --'--/-^-^Aag. Whenever 64-bit data i^Si 

sion of the same. ' ^«^"«>"«^'^«cA register is also copied to an image ver- 

sponds to bits 7-0. second b^e^o b^te Tl ete Wf^^ m '° f "'l' ''^^ '^-^ ^ "^^^ ^<'"^- 

oyte to bits 1 5-8 etc.). If bit 64 ,s set on the read, bits 63-0 contain the end of the 




. FIFO (as an additional effmlr^s L^rh^^'^tnT^^ '^'^ 

tone data must be more than 4 x or 32 byTes. L S^Jj ° ''^ 

22.5.6 CS6150 JPEG decoder 

the CS6150 JPEG decoder ccZl^Vlt^l^^T'^.''''^. ^'^ <^^^°'^ ^a^^ ^ t J 
which a gated version of the systemTock pcfk. thl T Jchno ogy). The core is clocked hy Jclk 

JPEG decoder on a single color pixd-by n«d b«2^* ^1 1 1*" S"^**'" mechanism for stalling the 

s:;id^2S^-''^^''-'^-----^^^^ 

quantization tables, restart interval deSn^^ ^1 bytestream contains data for 4e Huffman tables 
*e JPEGbytestre,;„ automSytlS;;^^ 

fying the JPEG segments the decoder re-dirite Ae^f«^f ! ?° segments. After identi- 

as appropriate. Any errors detected in SeljTesti^l^ ^'S!!?^."^*' *° " P~<'««^<'<1 

«^edand.ifanerrorisfound,thedecoi:rsr^^ 

Lines (DNL) marker at the ^n^^o^ly^l^i^;^^?'^^^^^ ^ dumber 

length as this is a modification to the^re »° '"^8^ °f '"ore than 64k lines 

Sd?om DlS^^Tbe'lntc^e'?^^^^^^ ^.P«^P^register. If this agister is set. then the data 
Pixels in theco^ect color or^T^rdLl™;:^^^ 

The following subsections describe the means by which the CS6150 internals can be made visible.

22.5.6.1 JPEG decoder parameter bus

nmics which internal parameters are diZlTZ^ T '"P"* i^PgOecPType) deter- 

the PK«;.e port does Z^^Z cTntTSs^ed^yTS iS" ""^'^ ^ '^^'^'•^ ^ 



Table 99. Parameter bus definitions

PType  Output data and description
0x5    Cs0[7:0]_Tq0[1:0]_V0[2:0]_H0[2:0]
       Cs0: identifier for the first scan component.
       Tq0: quantization table identifier for the first scan component.
       V0: vertical sampling factor for the first scan component. Values = 1-4.
       H0: horizontal sampling factor for the first scan component. Values = 1-4.
0x6    Cs1[7:0]_Tq1[1:0]_V1[2:0]_H1[2:0]
       Cs1, Tq1, V1 and H1 for the second scan component.
       V1, H1 undefined if NS<2.
0x7    Cs2[7:0]_Tq2[1:0]_V2[2:0]_H2[2:0]
       Cs2, Tq2, V2 and H2 for the third scan component.
       V2, H2 undefined if NS<3.
       Cs3[7:0]_Tq3[1:0]_V3[2:0]_H3[2:0]
       Cs3, Tq3, V3 and H3 for the fourth scan component.
       V3, H3 undefined if NS<4.
0x9    CsH[15:0]
       CsH: no. of rows in current scan.
0xA    CsV[15:0]
       CsV: no. of columns in current scan.
0xB    DRI[15:0]
       DRI: restart interval.
       000_HMAX[2:0]_VMAX[2:0]_MCUBLK[3:0]_NS[2:0]
       HMAX: maximal horizontal sampling factor in frame.
       VMAX: maximal vertical sampling factor in frame.
       NS: number of scan components in current scan, 1-4.

22.5.6.2 JPEG decoder status register



The status register flags indicate the current state of the CS6150 operation. When an error is detected during the decoding process, the decompression process in the JPEG decoder is suspended and an interrupt is sent to the CPU. The CPU can check the source of the error by reading the JpgDecStatus register. The error flags are active high to indicate an error condition as defined in Table 100. Once suspended, the decoder can discard further input until the next Start Of Image (SOI) without triggering any more errors.

Table 100. JPEG decoder status register definitions

Bit(s)  Name         Description
15-12   TblDef[7:4]  Indicates the number of Huffman tables defined.
11-8    TblDef[3:0]  Indicates the number of quantization tables defined.
        DecHfError   Set when an undefined Huffman table symbol is referenced during decoding.
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SI 



Table 100. JPEG decoder Status register definitions 



CtlError

HtError

QtError



OccHnror 



Note mo, SoPEC-slS^l'i^^Z X ^s^^r'' 

64klir,Bs. '^'^"""SoHe.gMlsO. Ttvs Is to ello,^ images longBr than 



Set when an invalid DHT segment is detected.

Set when an invalid DQT segment is detected.

IDctInProg

DecInProg

JpgInProg

Set when anything other than a JPEG marker is input.

Set when any of DecFlags[6:4] are set.

^Z^^^^^-;-;^<^- « .nco^piete Huff- 
:=e"dre"^^?.t.3^' ^"°^'^'°-"'^ • 

scanisco.plete.rndlcates'^^rareaitr d:^^^^ 



22.5.7 Half-block buffer interface 



o A / ~~ ... wi^w iity Qiaie. 



The CDU has a double-buffer of 2 x 256 bits at its output; each buffer is a half JPEG block. The JPEG decoder core needs to be able to be stalled at its output on a half JPEG block boundary, i.e. after 32 pixels (8 bits per pixel). We provide a mechanism for stalling the JPEG decoder core: the core is stalled when jpg_core_stall is 1. The half-block buffer interface is responsible for providing a set of double buffered half JPEG blocks to decouple JPEG decoding from writing those JPEG blocks to DRAM (write control unit).

only a single color plane. Data «iSe^e o^^eTn^ei. ^""^ °^ '^'^ ^'^ 

The half-block buffer interface therefore consists of 2 single JPEG half-block buffers and some simple combinatorial logic, as shown in Figure 102.



Figure 102. Block diagram of half-block buffer interface

22.5.7.1 Half-block buffer select unit

thas case, each buffer U a half JPEG block. i.e. 32 b^SS^^^^^^^ 

S^-a'^i'Jv.ltrL^' buffer: and 
single bit (H._*u^ for the cu^em S 

The output value half_block_ok_to_read equals buff_avail[rd_buff], and jpg_core_stall equals buff_avail[wr_buff]. When jpg_core_stall equals 1, the clock to the JPEG decoder core is gated off so as to stop the production of pixels. The clock gating is performed by means of the jclk_enable output from the CDU: when jclk_enable is 1 the core clock jclk runs, and when jclk_enable is 0 jclk is held at 0. jclk_enable is the inverse of jpg_core_stall.

p^c:^^^^-^:S?^ro?th1n^^^^^^^ ^ene. 

mented whenever pir_ou,_va/W is l md ^.^^1^7 ll ^'^ "^""^^^ 
pixeI_countf4:0J is 31, buff avail fwr ii^Tset uJl ^ '^''^'^ "^"^ When 
;.u_o«r_va/WANDedwith&einvi^of^^^^ T"*^" "^^ »-_e„ equals 

ANDed with rdLarfu. or JPg_corc_staIl. The output equals half_block_ok_to2read 



22.5.7.2 Contone plane buffer

Each contone plane buffer consists of two half JPEG block buffers, as shown in block diagram form in Figure 103.

Figure 103. Contone plane buffer interface



Each half JPEG block buffer holds 4 entries of 64 bits. Pixel data is collected in a shift register and written to the buffer in 64-bit quantities.

22.5.8 Write control unit

The write control unit writes the decompressed contone data to DRAM with the following arrangement. Each CDU access to DRAM implies 4 x 64-bit writes to consecutive DRAM words. CX denotes color X and LY denotes line Y, i.e. 8 bytes of a line in a JPEG block.

block 0, color 0, line 0 in word p bits 63-0, line 1 in word p+1 bits 63-0,
    line 2 in word p+2 bits 63-0, line 3 in word p+3 bits 63-0,
block 0, color 0, line 4 in word q bits 63-0, line 5 in word q+1 bits 63-0,
    line 6 in word q+2 bits 63-0, line 7 in word q+3 bits 63-0,
block 0, color 1, line 0 in word p bits 127-64, line 1 in word p+1 bits 127-64,
    line 2 in word p+2 bits 127-64, line 3 in word p+3 bits 127-64,
block 0, color 1, line 4 in word q bits 127-64, line 5 in word q+1 bits 127-64,
    line 6 in word q+2 bits 127-64, line 7 in word q+3 bits 127-64,
repeat for block 0 color 2, block 0 color 3
block 1, color 0, line 0 in word p+4 bits 63-0, line 1 in word p+5 bits 63-0,
etc.
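As an illustration only, the arrangement listed above can be expressed as a small address calculation. The following Python sketch is not part of the CDU design: the function name `cdu_word_location` and its parameters are hypothetical, and `p` and `q` stand for the first DRAM words of the lines 0-3 and lines 4-7 stores as in the listing.

```python
def cdu_word_location(block, color, line, p=0, q=0):
    """Return (dram_word, msb, lsb) for the 8 bytes of one line of one color.

    Assumption: lines 0-3 of a block live in words p+4*block .. p+4*block+3,
    lines 4-7 in words q+4*block .. q+4*block+3, and color c occupies
    bits 64*c+63 down to 64*c of the 256-bit word.
    """
    if not 0 <= color <= 3:
        raise ValueError("color must be 0-3")
    if not 0 <= line <= 7:
        raise ValueError("line must be 0-7")
    base = p if line < 4 else q          # pick the 4 line store
    word = base + 4 * block + (line % 4)  # 4 words per half JPEG block
    lsb = 64 * color                      # 64 bits per color plane
    return word, lsb + 63, lsb


# Example: block 1, color 0, line 1 lands in word p+5, bits 63-0.
print(cdu_word_location(1, 0, 1, p=100, q=200))
```

The word index depends only on block and line, while the color selects a 64-bit lane, which is why each write from the CDU touches only 64 bits of a 256-bit word.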
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are masked usinp 

only 64 bits out of the 256-bit access to DrJjJ^ vTd L from the CDU 

by the DIU. nus means that the decomp«^co«o^ the reinaum,g bits of the write are masked 

wntemasl«daccessesto4consecutivelS^S.Si:^^^^^ 

™„e in Pi^ .5. T.e 

fuilLblock_ok^to_r^ and line store oTto^ti^..^^^ '^ t\*^^ ^ "^W-e relies on the 
block to DRAM. Once the h^K-blZl^tk^^^, In^Tt'^r^J^^' *° « •'^^ JPEG 
requests a ^vrite access to DRAM by asserti«! cr^^lZ^ ^ ^ '"^^^^ne 
.ng to the first 64.bit value to be written l^c^dfu ZZ^, ^ ^u^"^""^ ^'l^^^- coirespond- 

access of 4x64 bits is issued by TSu 1^75^^^ ^""'^ ^ «-bit value in each 

fourth 64-bit values). The state machS^^^JaSyo rSef^rr^^^ '"fTt 

mg a read of 4 64-bit values ftom the half-block buffer in^l^ acknowledge from the DIU before initiat- 
put cdu_diu_^alid is asserted in the cycle Ser ^ '''V^^ '^"'^^ ^ '^^l^'- The out- 

the cdu_diu_data bus and should ^SLt k^^^T^r valid data is present on 

•s then sent to the half4.1ock buffer^tSe to iTcTcS t^^^ 
shou,dn^beavailabletobewr^entoagain.rr:i?^e^^ 

The write address is generated according to the following pseudocode:

// assign write address output to DRAM
cdu_diu_wadr[6:5] = 00     // corresponds to line number, only first address is
                           // issued for each DRAM access. Thus line is always 0.
                           // The DIU generates these bits of the address.
cdu_diu_wadr[4:3] = color
if (half == 1) then
    cdu_diu_wadr[21:7] = upr_halfblock_adr   // for lines 4-7 of JPEG block
else
    cdu_diu_wadr[21:7] = lwr_halfblock_adr   // for lines 0-3 of JPEG block

// update half, color, block and addresses after each DRAM write access
if (rd_adv_half_block == 1) then
    if (half == 1) then
        half = 0
        if (color == max_plane) then
            color = 0
            if (block == max_block) then     // end of writing a line of JPEG blocks
                pulse wradv8line
                block = 0



// 
// 



update half block address for ef.-^- -.r 

account of .ddreas wra^rnfln c?rcular"hu« "^"''''^ ""^"^ 

if rupr_h,lfbloc._adr == iS t^ln ' '"^ ' 

upr_halfblock_adr = buff start'.d^^ 

^j^^P-^-h-lfblock.^dr = bu££_st«t!r^ bu£f_end_adr ) then 
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else "''^-'-^">l->^-<^ ■ «Pr.haIfbloc,^.d. . . 3 

block 

upr_halfbaock adr // 

else -"'^ to addresB for lines 4-7 for next block 

color +4- 

elee 

half = 1 

if (color == maxjplane) then 

if (block max^block) then // end of writing ^ i- 

ena or wrrting a ime of JPEG blocks 

lwr_halfblock_a<Jr - buff atart"-.!..^ , 
elsif (lwr_h.lfbIocK^dr ri;:^i;:^/i"!:-"|f * ^ , 

lwr_halfblock.«clr » buffet ' than 

else " ~ 

lwr_halfbloclc_adr = lwr_halfblocK_«dr ♦ m«_block * 2 



else 

lvn:_halfblock_adr ♦ + 



// move to address for lines 0-3 for next block 
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Go 

(Figure: write control unit state machine, with states idle, req, ack, read, write1, write2, write3 and write4.)

22.5.9 Contone line store interface



The contone line store interfk" provi^^i^ mJLhl^^^^^^^ ^t'T" line-at-a-time. 

write to. mus the size of the Une storein DR^vfT.?^ I .'^ ♦he CDU to 

line store interfece is 8 lines. ^oSra^^K^^!^!^ * "^^^ V "^"^ 

scheme while 16 lines proviis altb^iu^Se^^^^^ ^""^ '^^^ ^"ff- 

set to the value of ««« fa,^ //S^^TTre ™ mav n„^^! transitions fiom 0 to 1. numjines_avail is 

available for 8 lines. indicai^Hhe^ Ae T*" '" ^ '^"8 as there is space 

writing 8 lines, the ,^te contiol^t sendl ^-IS^ "'^ "^^^ ^DU has finished 
CFU. and «««_/«e,_«v«rirde^lTtert T 
ft«e^to«_aiLr<._Hr/retobesetagain^^rS??isr^^^ 

priately. and sends its own r^.A'/Si to Ae cSSJ^o . '^"«P°°^°8 '<> ^'^iySUne pulses appro- 
it finishes reading then.. 
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23 Contone FIFO Unit (CFU) 



23.1 Overview 



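The Contone FIFO Unit (CFU) reads the decompressed contone data from DRAM, performs optional color conversion followed by optional color inversion in up to 4 color planes, and then feeds the data on to the HCU. Scaling of the data is performed in the horizontal and vertical directions by the CFU so that the output to the HCU matches the printer resolution. Non-integer scaling is supported in both directions.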

23.2 Bandwidth requirements 

The CFU must read the contone data from DRAM fast enough to match the rate at which the contone data is consumed by the HCU.

Scaling in the X direction is performed at the output of the CFU on a pixel-by-pixel basis. Scaling in the Y direction is performed by the CFU reading each line a number of times, according to the Y-scale factor, from DRAM. The HCU generates 1 dot (bi-level in 6 colors) per system clock cycle to achieve a print speed of 1 side per 2 seconds for full bleed A4/Letter pages. The CFU must therefore supply a 4 color contone pixel (32 bits) every 6 cycles. With a contone resolution of 267 ppi the CFU must read data from DRAM at 5.33 bits/cycle (see footnote 1).
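The 5.33 bits/cycle figure can be checked with a short calculation. This sketch assumes a dot resolution of 1600 dpi (taken from elsewhere in this document, not stated in this paragraph), so that a 267 ppi contone pixel is replicated about 6 times in X; the function name is hypothetical.

```python
def cfu_read_bandwidth(dot_dpi=1600, contone_ppi=267, bits_per_pixel=32):
    """Bits per cycle the CFU must read, at one dot per cycle.

    Assumption: the X scale factor is the dot resolution divided by the
    contone resolution, rounded to the nearest integer (1600/267 -> 6).
    """
    scale = round(dot_dpi / contone_ppi)
    return bits_per_pixel / scale


print(f"{cfu_read_bandwidth():.2f} bits/cycle")
```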



23.3 Color space conversion 



and K. directly represented by CMYK Ste tS fou^ " ' '^"^ -"^^ M. Y. 

muIti-SoPEC printing with exact colore. ^^'^ "''P'^"' ^old. metaUic green etc. for 

cS^<Js":^X^^'S CM^^Tk^^^^^^^^ visible entity when luminance and chrominance 

luminance info^ati^Janll ^JJ^nee^To be^l^^^^ be luminance, but C. M and Y each contain 

fore provide the means by which Cl^^ ^l';r^T^prclTcSt T ^'"i 

sion. w oorc^^ as YCiCb. K docs not need color conver- 



^p" esS SrSS 1 «<> ^<^'<^ then finally JPEG 

to CMY. '^"°"-*''^^'^*^^''°btained.thencolorcom.ertedtoRGB,andfm^^^ 

The external RIP provides conversion from RGB to vrw-K . c „ 

.mplementation of the invent transfonn wSLn sfpEC 2^^^ SStSVnT''' 

are normalized to occupy all 256 levels of an 8-bit bfnSy en^^''' ^'^^ ^' ^'^ ^ 

The CFU provides the translation to cither RGB or CMY RGB ic in.i..^«^ • - 



1. 32 bits / 6 cycles = 5.33 bits/cycle

• 1 color plane, no color space conversion

• 2 color planes, no color space conversion

• 3 color planes, no color space conversion

• 3 color planes YCrCb, conversion to RGB

• 4 color planes, no color space conversion

• 4 color planes YCrCbX, conversion of YCrCb to RGB, no color conversion of X

The YCrCb to RGB conversion is described in [14].



23.4 Color space inversion 



In addition to performing optional color conversion, the CFU also provides for optional bit-wise inversion in up to 4 color planes. This provides the means by which the conversion to CMY may be finalised, or it may be used to provide planar correlation.

The RGB to CMY conversion is given by the relationship:

• C = 255 - R

• M = 255 - G

• Y = 255 - B

These relationships require the page RIP to calculate the RGB from CMY as follows:

• R = 255 - C

• G = 255 - M

• B = 255 - Y
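The per-plane inversion can be sketched as follows. This is an illustrative Python model rather than the CFU implementation; the function name and the tuple representation of a pixel are assumptions, with bit i of the mask selecting inversion of color plane i.

```python
def invert_planes(pixel, invert_mask):
    """Invert selected 8-bit color planes of a pixel.

    pixel: tuple of up to 4 plane values (0-255), plane 0 first.
    invert_mask: bit i set means plane i is replaced by 255 - value,
    which for 8-bit data equals a bit-wise inversion.
    """
    return tuple(
        (255 - value) if (invert_mask >> plane) & 1 else value
        for plane, value in enumerate(pixel)
    )


# RGB plus a fourth plane: invert the first three planes only,
# giving CMY and leaving the fourth plane unchanged.
print(invert_planes((255, 0, 10, 77), 0b0111))
```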



23.5 Scaling 



The CFU supports non-integer scaling, with the scale factor represented by a numerator and a denominator. Only scaling up of the pixel data is allowed, i.e. the numerator should be greater than or equal to the denominator. For example, to scale up by a factor of two and a half, the numerator is programmed as 5 and the denominator as 2.

Scaling is implemented using a counter as described in the pseudocode below. An advance pulse is generated to move to the next dot (x-scaling) or line (y-scaling).

if (count + denominator - numerator >= 0) then
    count = count + denominator - numerator
    advance = 1
else
    count = count + denominator
    advance = 0
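To see how the counter behaves, the pseudocode can be stepped in software. The following Python sketch mirrors the counter update one step at a time; the function name and the list of returned advance flags are illustrative choices, not part of the design.

```python
def scale_advances(numerator, denominator, steps, count=0):
    """Run the scaling counter for `steps` output positions.

    Returns a list with 1 where an advance pulse would be generated
    and 0 where the current input pixel (or line) is repeated.
    """
    if numerator < denominator:
        raise ValueError("only scaling up is allowed")
    advances = []
    for _ in range(steps):
        if count + denominator - numerator >= 0:
            count = count + denominator - numerator
            advances.append(1)
        else:
            count = count + denominator
            advances.append(0)
    return advances


# Scale factor 5/2: every 5 output positions consume 2 input pixels.
print(scale_advances(5, 2, 10))
```

With a scale factor of 1 (numerator equal to denominator) the condition is always met, so every step advances.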
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J3 



23.6 Lead-in and lead-out clipping

In a multi-SoPEC system, the JPEG block at the boundary of the 2 SoPECs (JPEG block Q below) will be the last JPEG block in the line printed by SoPEC #1 and the first JPEG block in the line printed by SoPEC #2. Pixels in this JPEG block not destined for SoPEC #1 are ignored by appropriately setting the LeadOutClipNum register. Pixels in this JPEG block not destined for SoPEC #2 must be ignored at the beginning of each line. The number of pixels to be ignored at the start of each line is specified by the LeadInClipNum register.

It may also be the case that the CDU writes out more JPEG blocks than are required to be read by the CFU, as shown for SoPEC #2 below. In this case the value of the MaxBlock register in the CDU is set to correspond to JPEG block m, but the value of the MaxBlock register in the CFU is set to correspond to JPEG block m-1. Thus JPEG block m is not read in by the CFU.



Figure 106. Lead-in and lead-out clipping of contone data in multi-SoPEC environment

slSr^^^^^^^ are scaled up to the printers resolution. T^e 

Len^H register defines the "Je oft^^^^ -S-^^r. The HcuUne^ 

trols the scaling of the last valid pixe! i^S f^^^^^^ ^^^"^ ^ "^^^^^^O" ^^o- 
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23.7 Implementation 

Figure 107 shows a block diagram of the CFU. 



Figure 107. Block diagram of CFU

23.7.1 Definitions of I/O

Table 101. CFU port list and description

Port name             I/O  Description

Clocks and reset
pclk                  In   System clock.
prst_n                In   System reset, synchronous active low.

PCU interface
pcu_cfu_sel           In   Block select from the PCU. When pcu_cfu_sel is high both
                           pcu_adr and pcu_dataout are valid.
pcu_rwn               In   Common read/not-write signal from the PCU.
pcu_adr[6:2]          In   PCU address bus. Only 5 bits are required to decode the
                           address space for this block.
pcu_dataout[31:0]     In   Shared write data bus from the PCU.

DIU interface
cfu_diu_rreq          Out  CFU read request, active high. A read request must be
                           accompanied by a valid read address.
diu_cfu_rack          In   Acknowledge from DIU, active high. Indicates that a read
                           request has been accepted and the new read address can be
                           placed on the address bus, cfu_diu_radr.
cfu_diu_radr[21:5]    Out  CFU read address. 17 bits wide (256-bit aligned word).
diu_cfu_rvalid        In   Read data valid, active high. Indicates that valid read
                           data is now on the read data bus, diu_data.
diu_data[63:0]        In   Read data from DRAM.

CDU interface
cdu_cfu_wradv8line    In   Write 8 line pulse, active high. Indicates that the CDU has
                           finished writing 8 lines of decompressed contone data to the
                           circular buffer in DRAM and the data is available to be read
                           by the CFU.
cfu_cdu_rdadvline     Out  Read line pulse, active high. Indicates that the CFU has
                           finished reading a line of decompressed contone data from the
                           circular buffer in DRAM and that line of the buffer is now free.

HCU interface
hcu_cfu_advdot        In   Informs the CFU that the HCU has captured the pixel data on
                           the cfu_hcu_c[0-3]data lines and the CFU can now place the
                           next pixel on the data lines.
cfu_hcu_c1data[7:0]   Out  Pixel of data in contone plane 1.
cfu_hcu_c2data[7:0]   Out  Pixel of data in contone plane 2.
cfu_hcu_c3data[7:0]   Out  Pixel of data in contone plane 3.

23.7.2 Configuration registers



The configuration registers in the CFU are programmed via the PCU interface.

Table 102. CFU registers

Address  Register          #bits  Reset

Control registers
0x00     Reset             1      0x1
         A write to this register causes a reset of the CFU.
0x04     Go                1      0x0
         Writing 1 to this register starts the CFU. Writing 0 to this register
         halts the CFU.
         When Go is deasserted the state-machines go to their idle states but all
         counters and configuration registers keep their values.
         When Go is asserted all counters are reset, but configuration registers
         keep their values (i.e. they don't get reset).
         The CFU must be started before the CDU is started.
         This register can be read to determine if the CFU is running
         (1 - running, 0 - stopped).

Setup registers
0x10     MaxBlock          13     0x000
         Number of JPEG MCUs (or JPEG block equivalents, i.e. 8x8 bytes) in a line - 1.
0x14     BuffStartAdr      15     0x0000
         Points to the start of the decompressed contone circular buffer in DRAM,
         aligned to a half JPEG block boundary.
         A half JPEG block consists of 4 words of 256 bits, enough to hold 32 contone
         pixels in 4 colors, i.e. half a JPEG block.
0x18     BuffEndAdr        15     0x0000
         Points to the end of the decompressed contone circular buffer in DRAM,
         aligned to a half JPEG block boundary (address is inclusive).
         A half JPEG block consists of 4 words of 256 bits, enough to hold 32 contone
         pixels in 4 colors, i.e. half a JPEG block.
0x1C     4LineOffset       13     0x0000
         Defines the offset between the start of one 4 line store and the start of the
         next 4 line store. In Figure 108, if BuffStartAdr corresponds to line 0 block 0
         then BuffStartAdr + 4LineOffset corresponds to line 4 block 0.
         This register is required in addition to MaxBlock as the number of JPEG blocks
         in a line required by the CFU may be different from the number of JPEG blocks
         in a line written by the CDU.
0x20     YCrCb2RGB         1      0x0
         Set this bit to enable conversion from YCrCb to RGB.
         Should not be changed between bands.
         InvertColorPlane         0x0
         Set these bits to perform bit-wise inversion on a per color plane basis.
         bit0 - 1 invert color plane 0
              - 0 do not convert
         bit1 - 1 invert color plane 1
              - 0 do not convert
         bit2 - 1 invert color plane 2
              - 0 do not convert
         bit3 - 1 invert color plane 3
              - 0 do not convert
         Should not be changed between bands.
0x28     HcuLineLength            0x0000
         Number of contone pixels - 1 in a line (after scaling).
         Equals the number of hcu_cfu_dotadv pulses - 1 received from the HCU for each
         line of contone data.
         LeadInClipNum            0x0
         Number of contone pixels to be ignored at the start of a line (from JPEG
         block 0 in a line). They are not passed to the output buffer to be scaled in
         the X direction.
0x30     LeadOutClipNum           0x0
         Number of contone pixels to be ignored at the end of a line (from JPEG block
         MaxBlock in a line). They are not passed to the output buffer to be scaled in
         the X direction.
0x34     XstartCount              0x00
         Value to be loaded at the start of every line into the counter used for
         scaling in the X direction. Used to control the scaling of the first pixel in
         a line to be sent to the HCU. This value will typically be zero, except in the
         case where a number of dots are clipped on the lead in to a line.
0x38     XscaleNum                0x01
         Numerator of contone scale factor in X direction.
         XscaleDenom              0x01
         Denominator of contone scale factor in X direction.
0x40     YscaleNum                0x01
         Numerator of contone scale factor in Y direction.
0x44     YscaleDenom              0x01
         Denominator of contone scale factor in Y direction.

23.7.3 Storage of decompressed contone data in DRAM

The CFU reads decompressed contone data from DRAM in single 256-bit accesses. Each 256-bit DRAM word holds 8 pixels of 8 bits in each of 4 colors, taken from a single line of a JPEG block.
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23.7,4 



Figure 108. DRAM storage arrangement for a single line of JPEG blocks in 4 colors

(In the figure, CX denotes color X and LY denotes line Y, i.e. 8 bytes of a line in a JPEG block. Each DRAM word is read in one 256-bit access.)

The CFU reads data a line at a time in 4 colors from DRAM. The read sequence, as shown in Figure 108, is as follows:

line 0, block 0 in word p of DRAM
line 0, block 1 in word p+4 of DRAM
...
line 0, block n in word p+4n of DRAM
(repeat to read line a number of times according to scale factor)
line 1, block 0 in word p+1 of DRAM
line 1, block 1 in word p+5 of DRAM
etc.
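The read sequence can be written out as a short address generator. This is a sketch under the stated layout only; the function name `cfu_read_words` and its parameters are hypothetical, `p` is the first DRAM word of the 4 line store, and `repeats` stands in for the number of times the Y-scale factor causes a line to be re-read.

```python
def cfu_read_words(p, n_blocks, line, repeats=1):
    """List the DRAM words read for one contone line of a 4 line store.

    Assumption: block b, line l (0-3) is held in word p + 4*b + l,
    and the whole line is read `repeats` times before moving on.
    """
    if not 0 <= line <= 3:
        raise ValueError("line must be 0-3 within a 4 line store")
    words = []
    for _ in range(repeats):
        for block in range(n_blocks):
            words.append(p + 4 * block + line)
    return words


# Line 1 of a store with two blocks, read twice: p+1, p+5, p+1, p+5.
print(cfu_read_words(100, 2, 1, repeats=2))
```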



The CFU reads a complete line 



'^Jlf u'''^^X ""^^'^^ ^^^'nes from DRAM before it 



that 4 line store 



23.7.4 Decompressed contone buffer

23.7.5 Y-scaling control unit

that writes are to occur to. * bit (vvr.faij?) for the current buffer 

of d«a fi,„ DRAM to fte b,^ ,a,c^ bJlSi^tS^ l^rtSc " " 

diooion i> thus pofoimcd. "ntoDe data. Sc«lu« » U» pHndread lesoludon i« the Y 



// assign read address output to DRAM
cfu_diu_radr[21:7] = curr_halfblock
cfu_diu_radr[6:5] = line[1:0]

// update block, line, y_scale_count and addresses after each DRAM read access
if (wr_adv_buff == 1) then
    if (block == max_block) then   // end of reading a line of contone in up to 4 colors
        block = 0
        if (y_scale_count + y_scale_denom - y_scale_num >= 0) then
            y_scale_count = y_scale_count + y_scale_denom - y_scale_num
            pulse RdAdvline
            if (line == 3) then    // end of reading 4 line store of contone data
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line 



curr.halfblocJc = buf f^start.adr 
linens tar t^adr « buf f_start_adr 

lxne_start_adr = buff^start adr 
else " 

curr_hal£block - line_st«rt_.<lr ♦ 4Hne offset 

            else
                line ++
                curr_halfblock = line_start_adr
        else
            // re-read current line from DRAM
            y_scale_count = y_scale_count + y_scale_denom
            curr_halfblock = line_start_adr
    else
        block ++
        curr_halfblock ++





Figure 109. State machine to read decompressed contone data from DRAM
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23.7.6 Contone line store interface 

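The contone line store interface provides the mechanism for tracking the contone line store in DRAM, which the CDU writes 8 lines at a time and the CFU reads a line at a time.

The CFU may only read from DRAM when the CDU has written 8 complete lines of contone data. When the CDU has finished writing 8 lines, it sends a cdu_cfu_wradv8line pulse to the CFU and buff_lines_avail is incremented by 8. The CFU may continue reading from DRAM as long as buff_lines_avail is greater than 0; line_ok_to_read is set while buff_lines_avail is greater than 0. When it has completely finished reading a line of contone data from DRAM, the Y-scaling control unit sends a RdAdvline pulse to the contone line store interface and to the CDU to free up the line in the buffer in DRAM. buff_lines_avail is decremented by 1 on receiving a RdAdvline pulse.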
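The line accounting described in this section amounts to a simple credit counter. The class below is an illustrative software model only; the class and method names are hypothetical, while `buff_lines_avail` and `line_ok_to_read` follow the signal names used in the text.

```python
class LineStoreCounter:
    """Software model of the CFU-side contone line store accounting."""

    def __init__(self):
        self.buff_lines_avail = 0  # lines written by the CDU, not yet read

    @property
    def line_ok_to_read(self):
        # Reading is allowed while at least one line is available.
        return self.buff_lines_avail > 0

    def wradv8line(self):
        # The CDU has finished writing 8 lines.
        self.buff_lines_avail += 8

    def rdadvline(self):
        # The CFU has completely finished reading one line.
        if not self.line_ok_to_read:
            raise RuntimeError("no line available to read")
        self.buff_lines_avail -= 1


store = LineStoreCounter()
store.wradv8line()
for _ in range(8):
    store.rdadvline()
print(store.line_ok_to_read)
```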

23.7.7 Color Space Converter (CSC) 

RGB. lfYCrCb2mB^^^^l^ZV::;r° ''T^''' ^"-^^i"" from YCrCb to 

second stage. The 4th clr pi« i?L^t k^^! P'^^' '^'^ ^^e passed to the 

latency of the com^ert YC^b Sgb W^k L^'S^ '° '^^^B block. Note that &e 

plane as it bypasses the block. """^"^ »^ "'I'^lized for the 4th color 
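The YCrCb to RGB stage can be sketched in integer arithmetic using the coefficients given for the conversion in this section (359/256, 183/256, 88/256 and 454/256). The rounding by adding 128 before the shift and the function names are assumptions made for illustration; the hardware's internal arithmetic may differ.

```python
def _saturate(value):
    """Clamp a result to the 8-bit range 0-255."""
    return max(0, min(255, value))


def ycrcb_to_rgb(y, cr, cb):
    """Convert one 8-bit YCrCb pixel to 8-bit RGB.

    R* = Y + (359/256)(Cr-128)
    G* = Y - (183/256)(Cr-128) - (88/256)(Cb-128)
    B* = Y + (454/256)(Cb-128)
    Results are rounded (assumed: add 128, shift right by 8) and saturated.
    """
    r = (256 * y + 359 * (cr - 128) + 128) >> 8
    g = (256 * y - 183 * (cr - 128) - 88 * (cb - 128) + 128) >> 8
    b = (256 * y + 454 * (cb - 128) + 128) >> 8
    return _saturate(r), _saturate(g), _saturate(b)


# All-zero input: R and B go negative and saturate to 0, G rounds up.
print(ycrcb_to_rgb(0, 0, 0))
```

A fourth color plane, if present, would bypass this function unchanged, matching the bypass path described above.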

YCrCb2RGB can be set to 1 to convert YCrCb to RGB, and InvertColorPlane can be set to 0111 to then convert the RGB to CMY, leaving K unchanged.

If YCrCb2RGB equals 0 and InvertColorPlane equals 0000, no color conversion or inversion will take place, so the output pixels will be the same as the input pixels.

Figure 110 shows a block diagram of the color space converter.

Figure 110. Block diagram of color space converter

Accuracy is maintained with 18 bits. The conversion is implemented as follows:

• R* = Y + (359/256)(Cr-128)

• G* = Y - (183/256)(Cr-128) - (88/256)(Cb-128)

• B* = Y + (454/256)(Cb-128)

23.7.8 X-scaling control unit



J'^SS^t^nLVit'^^^tetYl^^^^^ -'^^ -d the HCU. The 

the mechanism for keejiag ^Ic of the c^nttd ^^^t^^^^^ -0^"*^-. proJZ 

read from until it has been written to. ouners, and ensures that a buffer cannot be 

that wntes are to occur to. * (*«r_6i/i^ for the current buffer 

_«av IS I . rixeis in the lead-in and lead-out areas are 



1. -179 is saturated to 0

2. 135.5, with rounding, becomes 136

3. -227 is saturated to 0

if (wradv == 1) then
    if (pixel_count == {max_block, b111}) then
        pixel_count = 0
    else
        pixel_count ++
    if ((pixel_count < leadin_clip_num)
        OR (pixel_count > ({max_block, b111} - leadout_clip_num))) then
        wr_en = 0
    else
        wr_en = 1
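The effect of the clipping condition on a whole line can be illustrated as follows. This Python sketch simplifies the hardware by indexing pixels directly rather than modelling the running counter; the function name is hypothetical, and `{max_block, b111}` is read as `max_block * 8 + 7`, i.e. the index of the last pixel in the line.

```python
def write_enables(max_block, leadin_clip_num, leadout_clip_num):
    """Return the wr_en value for every pixel position of one line.

    A pixel is clipped (wr_en False) when it lies in the lead-in area
    at the start of the line or the lead-out area at the end.
    """
    last_pixel = max_block * 8 + 7  # 8 pixels per JPEG block per line
    return [
        not (pixel < leadin_clip_num
             or pixel > last_pixel - leadout_clip_num)
        for pixel in range(last_pixel + 1)
    ]


# Two blocks (16 pixels), clip 2 at the start and 3 at the end.
enables = write_enables(1, 2, 3)
print(enables.count(True), "of", len(enables), "pixels written")
```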

When a wr^en puJse is sent to the output double-buffer hf./r r^^^ rr i^^- 

The output cJu_Hcu_a.au equals Buff ZT ^''"^'^ 

HCU that data is available to be ^^'C^^^ic^' ^^^i^fT'! ^"^'^ to the 

rr«;t.r.?ri-r'"''^«'"^ 

algorithm for tH>n-integer scaling is d^criSdTSl « implemented by pixel implication. TThj 

loaded with x^tarr_cl„t after ^^^t d aj'^ end of^^^^ -^cale.coun^ should b^ 

fi«t pixel is scaled by. hcujine length and ''^ ^"ch the 
linethatissenttotheHcJisscaledby 

if (hcu_cfu_dotadv == 1) then
    if (x_scale_count + x_scale_denom - x_scale_num >= 0) then
        x_scale_count = x_scale_count + x_scale_denom - x_scale_num
        rd_en = 1
    else
        x_scale_count = x_scale_count + x_scale_denom
        rd_en = 0
else
    x_scale_count = x_scale_count
    rd_en = 0


^omtKe^s^fi-rtjro-firrcS 

received then a „/_e« pulse is gemat-ed t^^J^l^^Zi^'^^l ^\^ hcu_cJu_M pulse is 
reset to 0 and x_scale_c^m is loaded vn^l^SJ^co^. '^'-'^^-^oun' is 
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24 Lossless Bi-level Decoder (LBD) 



24.1 Overview 



pass-through mode is PcoviS^rf^* ^^rpS^'"^^^^^^^ available. A 

50:1. Lossless bi-level compiession L^S^lvl?^"^^^^^ 

which compress poorly. ^ " P"*** » -^w"* 20- 1 with 10:1 possible for pages 

o'Ji^rr s%7spoT fV;s^5^ :rLd ^ data is 

umt) for the next stage in the prin^t^g^betoe^ r Ri^^ *° "^'^ (Halftoner/Compositor 

is used by the PCU L is a JS'Jfnt^^oTc^^ 'WjJni,*.^ control fUg that 



Figure 111. High level block diagram of LBD in context

24.2 Main features of LBD

Figure 112 shows a schematic outline of the LBD and SFU.

at 1 600 dpi. ^ therefore be long enough to store a complete line 

Spec ?o iS^^^^^ to the HCU. This throughput capability is retained for 

PECl LDB.outputs iSfn paS cH^^^^^^^^^^ ^ JJ?' »7 only read T dot/cycle T^J 
the LBD in SoPEC can run m^fch faste tlS i^^Sl ^^JlJ^'^^ ""^"^ 5°**^^. THerefore 
processing latency, to be absorbed. ^ * "^^^ «^'o^« Stalls, e.g. due to band 

grammed number of bits. SJche^Jr i^koS^r T T *° °f ""^ °' ^ P«'-P«>- 

length code, followed by pass through. run-length code is always executed as a run- 



S3 Proprietary Document 



29 Nov 2002 




A signal sfu_lbd_rdy indicates that both the SFU's NextLineFIFO and PrevLineFIFO are available for writing and reading, respectively.

An A4 line of 13824 dots requires 1.7 Kbytes of storage. An A3 line of 19488 dots requires 2.4 Kbytes of storage.
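The line storage figures follow from one bit per dot. The sketch below checks the arithmetic for the 13824-dot A4 and 19488-dot A3 line lengths quoted in this chapter. It assumes a Kbyte of 1024 bytes, and the helper name is illustrative rather than taken from the specification.

```python
def line_storage_bytes(dots_per_line: int) -> int:
    """Bytes needed to hold one bi-level line at 1 bit per dot (rounded up)."""
    return (dots_per_line + 7) // 8


for name, dots in (("A4", 13824), ("A3", 19488)):
    nbytes = line_storage_bytes(dots)
    # Assumption: 1 Kbyte = 1024 bytes.
    print(f"{name}: {dots} dots -> {nbytes} bytes ({nbytes / 1024:.1f} Kbytes)")
```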







Figure 112. Schematic outline of the LBD and the SFU
(Figure note: all FIFOs are 64 bytes, twice the DRAM data word width.)
24.2.1 Bi-level Decoding in the LBD




E 

8 

It 

So 




tio 



010 



110000 



010000 



100000 



000000 



<RLxRL>l00 



Pass Command: a0 <- b2

Vertical(0): a0 <- b1, color = !color

Vertical(1): a0 <- b1 + 1, color = !color

Vertical(-1): a0 <- b1 - 1, color = !color

Vertical(2): a0 <- b1 + 2, color = !color

Vertical(-2): a0 <- b1 - 2, color = !color

Vertical(3): a0 <- b1 + 3, color = !color

Vertical(-3): a0 <- b1 - 3, color = !color

Horizontal: a0 <- a0 + <RL> + <RL>



Pass through mode is activated by a special run-length code. Pass through mode continues to either end of line or for a pre-programmed number of bits, whichever is shorter. The special run-length code is always executed as a run-length code, followed by pass through. The pass through escape code is a medium length run-length with a run of less than or equal to 31.



Table 104. Runlength (RL) encodings

Encoding | Description
---------|------------
RRRRR1 | Short Black Runlength (5 bits)
RRRRR1 | Short White Runlength (5 bits)
RRRRRRRRRR10 | Medium Black Runlength (10 bits)
RRRRRRRR10 | Medium White Runlength (8 bits)
RRRRRRRRRR10 | Medium Black Runlength with RRRRRRRRRR <= 31, Enter pass through
RRRRRRRR10 | Medium White Runlength with RRRRRRRR <= 31, Enter pass through
RRRRRRRRRRRRRRR00 | Long Black Runlength (15 bits)
RRRRRRRRRRRRRRR00 | Long White Runlength (15 bits)
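The field widths in the runlength encodings (5-bit short, 10-bit medium black, 8-bit medium white, 15-bit long) determine which class a given run can use. The sketch below picks the smallest class that fits. It assumes each R field holds the run as a plain unsigned binary value, and that an encoder always prefers the shortest class so that a medium code never carries a value of 31 or less by accident (which would be read as the pass through escape). The function name is hypothetical.

```python
SHORT_BITS = 5
LONG_BITS = 15
MEDIUM_BITS = {"black": 10, "white": 8}


def runlength_class(run: int, color: str) -> str:
    """Return the smallest runlength class whose field can hold `run`."""
    if color not in MEDIUM_BITS:
        raise ValueError("color must be 'black' or 'white'")
    if run < 0:
        raise ValueError("run must be non-negative")
    if run < (1 << SHORT_BITS):            # 0..31
        return "short"
    if run < (1 << MEDIUM_BITS[color]):    # up to 1023 (black) or 255 (white)
        return "medium"
    if run < (1 << LONG_BITS):             # up to 32767
        return "long"
    raise ValueError("run does not fit in a single runlength code")
```

For example, a white run of 256 dots no longer fits the 8-bit medium white field and has to use the long form, while a black run of 256 dots still fits the 10-bit medium black field.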
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24.2.2 



Under the coding scheme of Table 104 it is still legal to write a run of 31 or less as a medium or long runlength. The LBD has been designed so that if a short runlength value is detected in a medium runlength, then once the horizontal command containing this runlength is decoded completely the LBD enters pass through mode and the bits following the runlength are un-compressed data. The number of bits to pass through is either a programmed number of bits or the end of the line, whichever comes first. Once pass through mode is completed the current color is the same as the color of the last bit of the passed through data.

DRAM Access Requirements 



Table 105. DRAM bandwidth requirements

Direction | Maximum number of cycles between each 256-bit DRAM access | Peak Bandwidth (bits/cycle) | Average Bandwidth (bits/cycle)
----------|------------------------------------------------------------|-----------------------------|-------------------------------
Read | 256 (1:1 compression) [1] | 1 (1:1 compression) | 0.1 (10:1 compression)

1: At 1:1 compression the LBD requires 1 bit/cycle or 256 bits every 256 cycles.
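The bandwidth figures can be cross-checked with a little arithmetic. The sketch below assumes the LBD produces decompressed output at a steady 1 bit/cycle and that the compression ratio applies uniformly; the 2560-cycle figure for 10:1 compression is derived here and is not stated in the table. Function names are illustrative.

```python
from fractions import Fraction

DRAM_WORD_BITS = 256


def read_bandwidth(compression_ratio: int, output_bits_per_cycle: int = 1) -> Fraction:
    """Compressed bits the LBD must read per cycle to sustain its output rate."""
    return Fraction(output_bits_per_cycle, compression_ratio)


def cycles_between_reads(compression_ratio: int, output_bits_per_cycle: int = 1) -> Fraction:
    """Cycles one 256-bit DRAM word lasts at the given compression ratio."""
    return DRAM_WORD_BITS / read_bandwidth(compression_ratio, output_bits_per_cycle)


print(cycles_between_reads(1))    # worst case, 1:1 compression
print(read_bandwidth(10))         # average case, 10:1 compression
```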
24.3.1 Definitions of IO

Table 106. LBD Port List

Port name | Pins | I/O | Description
----------|------|-----|------------
Clocks and Resets | | |
pclk | 1 | In | SoPEC functional clock.
prst_n | 1 | In | Global reset signal.
Bandstore signals | | |
cdu_endofbandstore[21:5] | 17 | In | Address of the end of the current band of data. 256-bit word aligned DRAM address.
cdu_startofbandstore[21:5] | 17 | In | Address of the start of the current band of data. 256-bit word aligned DRAM address.
lbd_finishedband | 1 | Out | LBD finished band signal to PCU and Interrupt Controller.
DIU interface signals | | |
lbd_diu_rreq | 1 | Out | LBD requests DRAM read. A read request must be accompanied by a valid read address.
lbd_diu_radr[21:5] | 17 | Out | Read address to DIU. 17 bits wide (256-bit aligned word).
diu_lbd_rack | 1 | In | Acknowledge from DIU that read request has been accepted and new read address can be placed on lbd_diu_radr.
diu_data[63:0] | 64 | In | Data from DIU to SoPEC Units. First 64 bits is bits 63:0 of 256 bit word. Second 64 bits is bits 127:64 of 256 bit word. Third 64 bits is bits 191:128 of 256 bit word. Fourth 64 bits is bits 255:192 of 256 bit word.
diu_lbd_rvalid | 1 | In | Signal from DIU telling SoPEC Unit that valid read data is on the diu_data bus.
PCU interface data and control signals | | |
pcu_addr[5:2] | 4 | In | PCU address bus. Only 4 bits are required to decode the address space for this block.
pcu_dataout[31:0] | 32 | In | Shared write data bus from the PCU.
lbd_pcu_datain[31:0] | 32 | Out | Read data bus from the LBD to the PCU.
pcu_rwn | 1 | In | Common read/not-write signal from the PCU.
pcu_lbd_sel | 1 | In | Block select from the PCU. When pcu_lbd_sel is high both pcu_addr and pcu_dataout are valid.
lbd_pcu_rdy | 1 | Out | Ready signal to the PCU. When lbd_pcu_rdy is high it indicates the last cycle of the access. For a write cycle this means pcu_dataout has been registered by the block and for a read cycle this means the data on lbd_pcu_datain is valid.
SFU interface data and control signals | | |
sfu_lbd_rdy | 1 | In | Ready signal indicating SFU has previous line data available for reading and is also ready to be written to.



Table 107. LBD Configuration Registers



S3 




0x04 



Go 




0x0 



A write to this register causes a reset of the LBD.
This register can be read to indicate the reset state:
0 - reset in progress
1 - reset not in progress



Writing 1 to this register starts the LBD.
Writing 0 to this register halts the LBD.
The Go register is reset to 0 by the LBD when it finishes processing a band.
When Go is deasserted the state-machines go to their idle states but all counters and configuration registers keep their values.
When Go is asserted all counters are reset, but configuration registers keep their values (i.e. they don't get reset).
The LBD should only be started after the SFU is started.
This register can be read to determine if the LBD is running (1 - running, 0 - stopped).



0x10 



PassThroughEnable 



PassThroughDotLength 



16 



Work registers (need to be set up before processing a band)
NextBandCurrReadAdr[21:5]



Width of expanded bi-level line (in dots) (must be a multiple of 16 bits).



0x0000



Writing 1 to this register enables pass-through mode.
Writing 0 to this register disables pass-through mode, thereby making the LBD compatible with PEC1.



Number of dots for which pass-through mode will last. If the end of the line is reached first then pass-through will be disabled.



(256-bit aligned DRAM address)



0x18 



NextBandLinesRemaining



0x0000 
0 



15 



0x0000 



Shadow register which is copied to CurrReadAdr when (NextBandEnable == 1 & Go == 0).
NextBandCurrReadAdr is the address of the start of the next band of compressed bi-level data in DRAM.



Shadow register which is copied to LinesRemaining when (NextBandEnable == 1 & Go == 0).
NextBandLinesRemaining is the number of lines to be decoded in the next band of compressed bi-level data.




0x1 C 



0x20 



NextBandPrevLineSource



NextBandEnable



Work registers (read only for external access)



0x0



Shadow register which is copied to PrevLineSource when (NextBandEnable == 1 & Go == 0).
1 - use the previous line read from the SFU for decoding the first line at the start of the next band.
0 - ignore the previous line read from the SFU for decoding the first line at the start of the next band (an all 0's line is used instead).



If (NextBandEnable == 1 & Go == 0) then
- NextBandCurrReadAdr is copied to CurrReadAdr,
- NextBandLinesRemaining is copied to LinesRemaining,
- NextBandPrevLineSource is copied to PrevLineSource,
- Go is set,
- NextBandEnable is cleared.
To start LBD processing NextBandEnable should be set.



0x24 



0x28 



CurrReadAdr[21:5]
(256-bit aligned DRAM address)



LinesRemaining



0x2C 



PrevLineSource



0x34 



CurrWriteAdr 



FirstUneOfBand 



17 



15 



15 



The current 256-bit aligned read address within the compressed bi-level image (DRAM address). Read only register.



Count of number of lines remaining to be decoded. The band has finished when this number reaches 0. Read only register.



1 - uses the previous line read from the SFU for decoding the first line at the start of the next band.
0 - ignores the previous line read from the SFU for decoding the first line at the start of the next band (an all 0's line is used instead).
Read only register.



The current dot position for writing to the SFU. Read only register.



Indicates whether the current line is considered to be the first line of the band. Read only register.



24.3.3 Starting the LBD between bands

The LBD is programmed with a start address for the compressed bi-level data. The LBD decodes a single band and then stops, clearing its Go bit and issuing a pulse on lbd_finishedband.

There are 4 mechanisms for restarting the LBD between bands:

b. The CPU programs the LBD's NextBandCurrReadAdr, NextBandLinesRemaining and NextBandPrevLineSource shadow registers and sets NextBandEnable before the end of the current band. At the end of the band the LBD clears Go. NextBandEnable is already set so the LBD restarts immediately.

c. The PCU executes commands from DRAM which reprogram the LBD's NextBandCurrReadAdr, NextBandLinesRemaining and NextBandPrevLineSource shadow registers and set NextBandEnable to restart the LBD. The advantage of this scheme is that the CPU could process band headers in advance and store the band commands in DRAM ready for execution.

d. The shadow registers are programmed and NextBandEnable is set before the end of the current band, so the LBD restarts immediately. Simultaneously, lbd_finishedband triggers the PCU to fetch commands from DRAM. The LBD will have restarted by the time the PCU has fetched commands from DRAM.
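The shadow-register handoff described for NextBandEnable can be modelled in a few lines. This is a minimal software sketch of the register behaviour only, not the hardware implementation: the class and function names are hypothetical, and it assumes the handoff condition (NextBandEnable == 1 & Go == 0) is re-evaluated as soon as Go is cleared at the end of a band.

```python
from dataclasses import dataclass


@dataclass
class LbdRegs:
    """Illustrative model of the LBD band registers and their shadows."""
    go: int = 0
    next_band_enable: int = 0
    next_band_curr_read_adr: int = 0
    next_band_lines_remaining: int = 0
    next_band_prev_line_source: int = 0
    curr_read_adr: int = 0
    lines_remaining: int = 0
    prev_line_source: int = 0


def try_band_handoff(regs: LbdRegs) -> bool:
    """Copy the shadow registers and set Go if NextBandEnable == 1 and Go == 0."""
    if regs.next_band_enable == 1 and regs.go == 0:
        regs.curr_read_adr = regs.next_band_curr_read_adr
        regs.lines_remaining = regs.next_band_lines_remaining
        regs.prev_line_source = regs.next_band_prev_line_source
        regs.go = 1
        regs.next_band_enable = 0
        return True
    return False


def finish_band(regs: LbdRegs) -> bool:
    """Model the end of a band: Go is cleared, then the handoff is re-checked."""
    regs.go = 0
    return try_band_handoff(regs)
```

If the shadow registers and NextBandEnable are written while a band is still being decoded, `try_band_handoff` does nothing until `finish_band` clears Go, at which point the next band starts immediately; this corresponds to mechanism b.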
24.3.4 Top-level Description

A block diagram of the LBD is shown in Figure 113. 



Figure 113. Block diagram of lossless bi-level decoder
(Blocks shown: DRAM Interface Unit, Registers and Resets, Stream Decoder, Command Controller, Next Edge Unit, End of Band Unit, and the Spot FIFO Unit with its Previous Line Buffer and Next Line Buffer.)

The LBD contains the following sub-blocks:

Table 108. Functional sub-blocks in the LBD

Registers and Resets

Stream Decoder: Accesses the bi-level description from the DRAM through the DIU interface. It decodes the bit stream into a command with arguments, which it then passes to the command controller.

Command Controller

Next Edge Unit: Scans through the Previous Line Buffer using its current address to find the next edge of a color provided by the command controller. The next edge unit outputs this as the next current address back to the command controller and sets a valid bit when this address is at the next edge.
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■ SfS^s ntSSef '^"^ — ''^'^ writes this dau into *e 
Naming of signals and logical blocks are taken from [18].

The LBD is able to stall mid-line should the SFU be unable to supply a previous line frame or receive a current line frame due to band processing latency.

24.3.5 Registers and Resets sub-block description 

Kis. Hie register descripSoils for He LBD «. listtd i "iSikfo? Uieje two regis- 

LBD igooces ,E. p,.viou. lii l.^o^S LSS^^S".?? <^ » l» tte 

line regMdless of «.!», the of theOTJ is ««<«<»" ifit o receiving .11 2er„s f„, fte p,„iom . 

s;,frret^o'ri,rr,s:snst^s^^r'"»'^."°"- 

pressed data stream. cquesnng oata from the DIU and commence decoding of the com- 

24.3.6 Stream Decoder Sub-block Description

SibS?St^otS,r5?6-t; n?r^^^^^ *^ '^•U -cesses of 

the empQ, space cre"td by Z bl^el IJfft ^ '"^"'Z FIFO to fill up 

intoa^^^L.d/a^l.^.^aLSt'lS^^^ 
A dataflow block diagram of the stream decoder is shown in Figure 114.

Figure 114. Stream decoder block diagram

24.3.6.1 DecodeC - Decode Command

The DecodeC logic encodes the command from the leading bits of the bit stream to output one of three commands: SKIP, VERTICAL and RUNLENGTH. It also provides an output to indicate how many bits were consumed, which feeds back to the barrel shifter.

There is a fourth command, PASS_THROUGH, which is inferred from a special runlength. If a short runlength value is detected encoded as a medium runlength, this tells the Stream Decoder that once the horizontal command containing this runlength is decoded completely the LBD enters PASS_THROUGH mode. Following the runlength there will be a number of bits that represent un-compressed data. The LBD stays in PASS_THROUGH mode until all these bits have been decoded. This occurs once the programmed number of bits is reached or the line ends, whichever comes first.

24.3.6.2 DecodeD - Decode Delta 

The output delta is a 15 bit number, which is generally considered to be positive, but since it only needs to address 13824 dots for an A4 page and 19488 dots for an A3 page (of 32,768), a 2's complement representation of -3, -2, -1 will work correctly for the data pipeline that follows. This unit also outputs how many bits were consumed. In PASS_THROUGH mode this unit parses the bits that represent the un-compressed data.
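The reason a 2's complement -3, -2 or -1 works in a nominally unsigned 15-bit delta is that addition modulo 2^15 gives the same result as signed addition whenever the true sum stays within range. A short sketch of that arithmetic (the function names are illustrative, not taken from the design):

```python
ADDR_BITS = 15
ADDR_MASK = (1 << ADDR_BITS) - 1  # 0x7FFF, addresses 0..32767


def encode_delta(delta: int) -> int:
    """Represent a small signed delta as a 15-bit two's complement value."""
    return delta & ADDR_MASK


def apply_delta(addr: int, delta15: int) -> int:
    """Add a 15-bit delta to a 15-bit dot address, discarding the carry out."""
    return (addr + delta15) & ADDR_MASK


# Vertical(-3) applied at the last dot of an A3 line (19488 dots, index 19487).
print(hex(encode_delta(-3)), apply_delta(19487, encode_delta(-3)))
```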

24.3.6.3 State-machine

DRAM wordlize. P«>g«»m«l so that the distance between them is a multiple of the 256-bit 

rL"^T:oro£ * ^^^^ ^•^^ i-tructions to 

passed, and thJs^Z^Sne^t^S^^T^ In the first instruction fetch, the first run length is 
fetch from the com^cTtroll^oAS^^f^^^^ *^ ''^on 

24.3.7 Command Controller Sub-block Description
24.3.7.1 State machine

Figure 115. Command controller block diagram

The following is an explanation of all the states that the state machine utilizes.

i START

ii AWAIT_BUFFER

The NEU contains a buffer memory for the data it receives from the SFU. When the command controller enters this state the NEU detects this and starts buffering data. The command controller remains in this state until the state machine in the NEU enters the NEU_RUNNING state. Once this occurs the command controller can proceed to the PARSE state.

iii PAUSE_CC

due to band processing fSZcTlU^ ofSilTr ^ Additionally the SFU can also stall mid-line 

decoder gets more of ^e coV^eLTd: alt^T ^oTth^S^^"^^^^^^ '''"^•^ ^"^^ ^^^^ 

frames. All of the remaining states check if cX,a^^ »e DRAM or the SFU can receive or deliver new 
decoder) or if. /ujbd S^LsTo «r^ ^d flif^^^^^ ^"^^ ^ ^^"^ of the stream 

command conttolFer entSfo iweve^^^fnd^ ZsnSZ ,1"^ T"""' ''^.'^^-^^ ^ *e state that the 
both asserted and the LBD can recor^encJ^dllTp^ssl^^^^^^ ' '''^"'''^'^ ^'^-^^''-'^ ^ 
When in this state the command controller can receive one of four valid commands:

a) Runlength or Horizontal

b) Vertical

W^lc. i, ,.f , - color ft,„ .„ 

element on *e previous line, for a VettfSimZ?™T^ » ae cun^l Une ,s reluive to ihe ohiinging 

c.,respo.d„,H.,w.«„„i,ftL'^s:rtS::;yrpi';rite'°'^'^'"^ 

c) Skip

c'^h^S^'t'Tiste^r.e^or^^^^^^^ » <--t line is not 

that the command controller" eaL S^eTref^^^^^^ '''"S^ ^^iP "^"-ands 

the current color in this case. ^fez-ftca/CO) commands and has been coded not to change 

d) Pass Through 

r:^tr2:Xr^roth^„SrS~ - ^•-•^ ^^^-^ t^at is uses to construct 

LBD can recommence nonnalSpSon acZ'Se t^f 1 controlled in the stream decoder, the 
color as the last bit in un-compress^dTu sLfTpS, mS^f ' ""^^ 

command controller as each pass thiouirf. enm,^: J -^t ^^'^ ^^^^ in the 

cessed in one clock cycle. ^ "'^"'^ ^'^^ ^'"'der can always be pro- 

v WAIT_FOR_RUNLENGTH

clock cycle the command cSler^l to tU^'pn^'ll^Z ?^ ^^^^^"'^ After the first 
LENGTH d^tz has been consumed OnS Si^sL /„H -^^ -^^^^ ^' ^tW- 

controUer will renun to the pSs^e " " «»e command 

vi WAIT_FOR_NE

"nTinrft^i^r^^r^^ 

remains here until the edge is dcJS Jililn^T u WArT_FORJ^E state and 

reUim to the PARSE " " command controller will 
vii FINISH_LINE

Figure 116. State diagram for the Command Controller (CC) state machine



24.3.8 Next Edge Unit Sub-block Description 

SFU and it buffers Z ^rcvJot detect,^^^ « ^'^^^^ to 

Controller supplies the current^^. '^'^1 ^ °P«^«0" *e Command 

the end of a length S^St^^ScT ^'^S^' "e 
NEUmU sea«hthc?Lio^lSe SnSfll?,^''^ also supplied and using these two values the 

Command ControDer ^th^iS addrS ! .„ 1 ^ ^ '''^^^ " A^£C/retums this location to the 

tioller that S^'^^n "etit^ ^j^S SL'S ; J ' C°"™-'<' ^on- 

^^C/ operates on 1 6-bit w^^ds tTftls^ss^w fhi *° ^"^^ TTie 
t.scasethe.BUwUl.,u^rre'^^E^:;r^-^^^^^ 
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tinue doing this until it finds an edge or reaches th^ ^nH 




Figure 117. Next Edge Unit block diagram 

24.3.8.1 NEU Buffer 



I11*e te.'S^rv^^^^^^^^ ''-^^ r ^ P'-^ous line and is not delineated 

presents a problem for yi^J^r^Z^.Trl^ a' ^ """^ ^ ^ « '^'^ f™"" SFU- ™s 
ing clement in 2^^^^ " *^ ««> « chang- 

£r*°e"^°ct^i:f^^^^^ 

struct the current fhune of the^umJnT^f >nfonnat.on that .s needed from the previous line to con- 
xcques. for a cur«.nt line is received unti, it is rented and r./^et^ZV:::L S^r^rsts"*"" * 



anew 
Figure 118. Next edge unit buffer diagram

24.3.8.2 NEU Edge Detect
Figure 119. Next edge unit edge detect diagram
Table 109. Decode_b truth table

Input | Output
------|-----------------
0000 | 1111111111111111
0001 | 1111111111111110
0010 | 1111111111111100
0011 | 1111111111111000
0100 | 1111111111110000
0101 | 1111111111100000
0110 | 1111111111000000
0111 | 1111111110000000
1000 | 1111111100000000
1001 | 1111111000000000
1010 | 1111110000000000
1011 | 1111100000000000
1100 | 1111000000000000
1101 | 1110000000000000
1110 | 1100000000000000
1111 | 1000000000000000
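Every row of the Decode_b truth table is the 16-bit all-ones word with its n least significant bits cleared, where n is the 4-bit input. A compact sketch that regenerates the table (the function name mirrors the table title; returning a bit string is a presentation choice made here):

```python
def decode_b(n: int) -> str:
    """Return the 16-bit Decode_b mask for a 4-bit input as a bit string."""
    if not 0 <= n <= 15:
        raise ValueError("input must be a 4-bit value")
    mask = (0xFFFF << n) & 0xFFFF  # clear the n low-order bits
    return format(mask, "016b")


for n in range(16):
    print(format(n, "04b"), decode_b(n))
```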



Table 110. Decode_b_ext truth table

Input | Output
------|-------
Vertical(-3) | 111
Vertical(-2) | 111
Vertical(-1) | 011
OTHERS | 001



the fim prc;;^; eTemeS^^^^ 2.2.5 a) in [ISJ .fers to "Processing 

is only used by the NEU if ix is 7ot^!TJ^4t- ^ ^ ^ ^^^'^ "^"^'-^ However it 

asserted at the begin^ng o^a ,ine ' i^/i^^r^iCC is • 1' which is only 



element is coded" ThirmMns A« „™« ITT"^ ^'^'"^"^ *fter the last actaal 
24.3.8.3 Encode_b_one_hot

Table 111 lists the truth table outlining the functionality required by this block.

Table 111. Encode_b_one_hot Truth Table

Input | Output
--------------------|--------------------
XXXXXXXXXXXXXXXXXX1 | 0000000000000000001
XXXXXXXXXXXXXXXXX10 | 0000000000000000010
XXXXXXXXXXXXXXXX100 | 0000000000000000100
XXXXXXXXXXXXXXX1000 | 0000000000000001000
XXXXXXXXXXXXXX10000 | 0000000000000010000
XXXXXXXXXXXXX100000 | 0000000000000100000
XXXXXXXXXXXX1000000 | 0000000000001000000
XXXXXXXXXXX10000000 | 0000000000010000000
XXXXXXXXXX100000000 | 0000000000100000000
XXXXXXXXX1000000000 | 0000000001000000000
XXXXXXXX10000000000 | 0000000010000000000
XXXXXXX100000000000 | 0000000100000000000
XXXXXX1000000000000 | 0000001000000000000
XXXXX10000000000000 | 0000010000000000000
XXXX100000000000000 | 0000100000000000000
XXX1000000000000000 | 0001000000000000000
XX10000000000000000 | 0010000000000000000
X100000000000000000 | 0100000000000000000
1000000000000000000 | 1000000000000000000
0000000000000000000 | 0000000000000000000



The output is a "one-hot" vector that indicates where the edge transition is located. In cases of multiple edges, only the first one will be picked.
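The Encode_b_one_hot truth table keeps only the least significant set bit of its 19-bit input, which is what "only the first one will be picked" amounts to when the don't-care (X) positions are the higher-order bits. In software the same result comes from the classic `x & -x` idiom. This is a sketch of the logic function only; the hardware would implement it as a priority encoder.

```python
WIDTH = 19
WIDTH_MASK = (1 << WIDTH) - 1


def encode_b_one_hot(bits: int) -> int:
    """Return a one-hot vector marking the lowest set bit of a 19-bit input."""
    bits &= WIDTH_MASK
    return bits & -bits  # isolates the least significant 1; 0 stays 0


print(format(encode_b_one_hot(0b0000000000000101000), "019b"))
```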

24.3.8.4 Encode_b_4bit

SS'Sl^'tLii?^'"°' '^^'^ ^ — *e data to detennine the add^ss 

asserted the bit location in thrvector is S^e^^H^ ^ '^'^ fr'^^- « bit 
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^^o^t '^'"'^ ""'"^ ^ implemented to 



return bip to the command 



for V(n)blp = X n modulusl6 
^o^n^/' ""^^"^ ''"^ ^"'^ — 'o-Hot. vecto. „ ^he vertical 



24.3.8.5 State machine 




Figure 120. State diagram for the Next Edge Unit (NEU) state machine 

The following is an explanation of all the states that the NEU state machine utilizes.

i NEU_START

The NEU remains in this state until it detects that the command controller has entered its AWAIT_BUFF state. When this occurs the NEU enters the NEU_FILL_BUFF state.

ii NEU_FILL_BUFF

The NEU fills its buffer with new data from the SFU. Once completed it enters the NEU_HOLD state.

iii NEU_HOLD

The NEU waits in this state for one clock cycle while data requested from the SFU on the last access returns.

iv NEU_RUNNING

NEU_RUNNING controls the requesting of data from the SFU.

v NEU_EMPTY

The NEU leaves this state when the end_of_line signal is detected from the LBD.
24.3.9 Line Fill Unit sub-block description

Lt.X'iLS'iS« JTT™ rrtii?"rr ^^^^^ 

when it has put together a comDlete 1 6 hi^^^ . ^'"''^^^^ Command Controller and 



A dataflow block diagram of the line fill unit is shown in Figure 121.

Figure 121. Line fill unit block diagram

The dataflow above has the following blocks:

24.3.9.1 State Machine

The following is an explanation of all the states that the LFU state machine utilizes.

i LFU_START

This is the state that the LFU enters when a hard or soft reset occurs or when Go has been de-asserted.

This state can not left until the 
longe 
NEU, 



ii LFU_NEW_REG
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dua 10 Ite SFU wilh the wrt» ml^. d™i n ■» completej Ihe LRJ will output the 

st-e tnachin. «T^,XX<^U ^J^^^^ "°' T^^'^ " 

^^^«.^.;h^i.a.he,*.^!riS'^s,rj:s::^^^ 

iii LFU_COMPLETE_REG

n^^r^^^u u^' u J l/T ^°°^P^^^fi<l a^id the data can be wntten to the SFU In the case of the 



Figure 122. State diagram for the Line Fill Unit (LFU) state machine

24.3.9.2 line_fill_data

jo^^h^. h, the ,.uej„.^ '^7Jr-SrrJJ;4t°S Wot-iTpS t 

if (lfu_state == LFU_START) OR (lfu_state == LFU_NEW_REG) then
    work_sfu_wdata = color_sel_16bit_lf
else
    work_sfu_wdata[(15 - limit) downto limit] =
        color_sel_16bit_lf[(15 - limit) downto limit]
25 Spot FIFO Unit (SFU)

25.1 OVERVIEW

The amount of buffering can be increased or decreased without affecting either the LBD or HCU. Scaling of data is performed in the horizontal and vertical directions by the SFU so that the output to the HCU matches the printer resolution. Non-integer scaling is supported in both the horizontal and vertical directions. Typically, the scale factor will be the same in both directions but may be programmed to be different.

25.2 Main features of the SFU 

The SFU replaces the Spot Line Buffer Interface (SLBI) in PEC1. The spot line store is now located in DRAM.

The LBD interfaces to the SFU with a data width of 16 bits. The SFU interfaces to the HCU with a data width of 1 bit.

Since the DRAM word width is 256 bits but the LBD line length is a multiple of 16 bits, a capability to flush the last multiples of 16 bits at the end of a line into a 256-bit DRAM word is required. Therefore, SFU reads of DRAM words at the end of a line, which do not fill the DRAM word, will already be padded.

Previous line data is not supplied until after the first lbd_sfu_advline strobe from the LBD (zero data is supplied instead). lbd_sfu_advline tells the SFU to advance to the next line. A signal sfu_lbd_rdy indicates that the SFU is available for both reading and writing. The LBD should not generate lbd_sfu_pladvword or lbd_sfu_advline strobes until sfu_lbd_rdy is asserted.

A signal sfu_hcu_avail indicates that the SFU has data to supply to the HCU. The HCU should not generate the hcu_sfu_advdot signal until sfu_hcu_avail is true. The HCU can therefore stall waiting for the sfu_hcu_avail signal.

X and Y non-integer scaling of the bi-level dot data is performed in the SFU.

The SFU requires two 256-bit DRAM read accesses and one 256-bit DRAM write access every 256 cycles. A single DIU read interface will be shared for reading the current and previous lines from DRAM.
25.3 BI-LEVEL DRAM MEMORY BUFFER BETWEEN LBD, SFU AND HCU

Figure 123. Bi-level DRAM buffer
(Panels (a) and (b) show two orderings of the pointers lbd_nextline_adr, lbd_prevline_adr, hcu_readline_adr and hcu_startreadline_adr between the low address and the high address of the buffer. Key: free buffer space; buffer space accessed by LBD Interface FIFOs; filled buffer space read by HCU Read Line FIFO; buffer space read by both HCU Read Line FIFO and LBD Interface FIFOs.)

The SFU interfaces to DRAM via three FIFOs: 

a. The HCUReadLineFIFO which supplies dot data to the HCU.

b. The LBDNextLineFIFO which writes decompressed bi-level data from the LBD.

c. The LBDPrevLineFIFO which reads previous decompressed bi-level data for the LBD.

There are four address pointers used to manage the bi-level DRAM buffer:

a. hcu_readline_adr[21:5] is the read address in DRAM for the HCUReadLineFIFO.

b. hcu_startreadline_adr[21:5] is the start address in DRAM for the current line being read by the HCUReadLineFIFO.

c. lbd_nextline_adr[21:5] is the write address in DRAM for the LBDNextLineFIFO.

d. lbd_prevline_adr[21:5] is the read address in DRAM for the LBDPrevLineFIFO.

The address pointers must obey certain rules which indicate whether they are valid:

'1^d-7^l!^-^!^IH-^°''r ^'^'^ '^""^ - 4e line than 

b.TheSFU cannot overwrite the current line that the HCUUr««i,n«*v«™:« u,r j . . 
lbd_nex,line_adrpi:5J /= hcu_^arneadl^^_al/2ISJ ^ Mf_,tartadryalid - 

cThe LBDNextLineFIFO must be writing earlier in the line than LBDPrevLimFIFO « 

•.At stamp i e.w^^ aj,/2y.5, -n^e&st 

S^trif^^aij^^rair^"^-^"--'*^^^^^^^ 

f. The address pointers can wrap around the SFU bi-Ievel store area in DRAM 

As a guideline, the typical FIFO size should be a minimum of 2 lines stored in DRAM, up to a programmable number of lines. A larger buffer allows lines to be decoded in advance, which can be useful for absorbing local complexities in compressed bi-level images.

25.4 DRAM ACCESS REQUIREMENTS

itt^^For'" ' the 

ll^e%r;lnS^e%TbTDlivM^ 

vious. cuxrent aSd n^l ite m^^^""' '"^^ ^^^'^ '^^'^ ^-^-^ f°--h of its p^L 

The SFU's DIU bandwidth requirements are summarized in Table 112.



Table 112. DRAM bandwidth requirements 




1: Two separate reads of 1 bit/cycle. 
2: Write at 1 bit/cycle. 



25.5 SCALING 
if (count + denominator >= numerator) then
    count = (count + denominator) - numerator
    advance = 1
else
    count = count + denominator
    advance = 0

X scaling controls whether the SFU supplies the next dot or a copy of the current dot when the HCU asserts hcu_sfu_advdot. The SFU counts the number of hcu_sfu_advdot signals from the HCU. When the SFU has supplied an entire HCU line of data, the SFU will either re-read the current line from DRAM or advance to the next line of HCU read data depending on the programmed Y scale factor.

An example of scaling for numerator = 7 and denominator = 3 is given in Table 113. The signal advance, if asserted, causes the next input dot to be output on the next cycle; otherwise the same input dot is output.

Table 113. Non-integer scaling example for scaleNum = 7, scaleDenom = 3

count | advance | dot
------|---------|----
0 | 0 | 1
3 | 0 | 1
6 | 1 | 1
2 | 0 | 2
5 | 1 | 2
1 | 0 | 3
4 | 1 | 3
0 | 0 | 4
3 | 0 | 4
6 | 1 | 4
2 | 0 | 5
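The scaling pseudo-code can be run directly to regenerate the numerator 7, denominator 3 example. The sketch below is a plain software transcription of that pseudo-code. Two things are assumptions made here for illustration: the third value in each row is taken to be the 1-based index of the input dot being output, and `start_count` stands in for the starting value of count at the beginning of a line.

```python
def scale_sequence(numerator: int, denominator: int, steps: int, start_count: int = 0):
    """Return (count, advance, dot) for each output dot of a scaled line."""
    rows = []
    count = start_count
    dot = 1  # assumed 1-based index of the current input dot
    for _ in range(steps):
        advance = 1 if count + denominator >= numerator else 0
        rows.append((count, advance, dot))
        if advance:
            count = (count + denominator) - numerator
            dot += 1  # the next output dot comes from the next input dot
        else:
            count = count + denominator
    return rows


for row in scale_sequence(7, 3, 11):
    print(row)
```

Over the seven output dots from count 0 back to count 0, exactly three input dots are consumed, which gives the 7/3 scale factor.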



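25.6 Lead-in and lead-out clipping

To account for the case where there may be two SoPEC devices, each generating its own portion of a dot-line, the first dot in a line may not be replicated the total scale-factor number of times by an individual SoPEC. The dot will ultimately be scaled up correctly with both devices doing part of the scaling, one on its lead-out and the other on its lead-in. Scaled-up dots on the lead-out, i.e. which go beyond the HCU line-length, will be ignored. Scaling on the lead-in, i.e. of the first valid dot in the line, is controlled by setting the XstartCount register.

At the start of each line, count in the pseudo-code above is set to XstartCount. If there is no lead-in, XstartCount is set to 0. If there is lead-in, XstartCount needs to be set to the appropriate value of count in the sequence above.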
25.7 Interfaces between LBD, SFU and HCU

Figure 124. Interfaces between LBD/SFU/HCU
(Blocks shown: LBD, DIU, SFU with its DIU Interface and Address Generator, Previous Line FIFO, Next Line FIFO and Current Line FIFO, and HCU.)

25.7.1 LBD-SFU Interfaces

25.7.1.1 LBDNextLineFIFO Interface

The LBDNextLineFIFO interface from the LBD to the SFU comprises the following signals:

• lbd_sfu_wdata, 16-bit write data.

• lbd_sfu_wdatavalid, write data valid.

• lbd_sfu_advline, signal indicating LBD has advanced to the next line.

25.7.1.2 LBDPrevLineFIFO Interface

The LBDPrevLineFIFO interface from the SFU to the LBD comprises the following signals:

• sfu_lbd_pldata, 16-bit data.

The previous line read buffer interface from the LBD to the SFU comprises the following signals:

• lbd_sfu_pladvword, signal indicating to the SFU to supply the next 16-bit word.

• lbd_sfu_advline, signal indicating LBD has advanced to the next line.
Previous line data is not supplied until after the first lbd_sfu_advline strobe from the LBD (zero data is supplied instead). The LBD should not assert lbd_sfu_pladvword unless sfu_lbd_rdy is asserted.

25.7.1.3 Common Control Signals

sfu_lbd_rdy indicates to the LBD that the SFU is available for writing. After the first lbd_sfu_advline and before the number of lbd_sfu_pladvword strobes received is equivalent to the LBD line length, sfu_lbd_rdy indicates that the SFU is available for both reading and writing. Thereafter it indicates the SFU is available for writing.

The LBD should not generate lbd_sfu_pladvword or lbd_sfu_advline strobes until sfu_lbd_rdy is asserted.

25.7.2 SFU-HCU Current Line FIFO Interface

The interface from the SFU to the HCU comprises the following signals:

• sfu_hcu_sdata, 1-bit data.

• sfu_hcu_avail, data valid signal indicating that there is data available in the SFU HCUReadLineFIFO.

The interface from HCU to SFU comprises the following signals:

• hcu_sfu_advdot, indicating to the SFU to supply the next dot.

The HCU should not generate the hcu_sfu_advdot signal until sfu_hcu_avail is true. The HCU can therefore stall waiting for the sfu_hcu_avail signal.
25.8 Implementation

25.8.1 Definitions of IO

Table 114. SFU Port List

Port name | Pins | I/O | Description
----------|------|-----|------------
Clocks and Resets | | |
pclk | 1 | In | SoPEC functional clock.
prst_n | 1 | In | Global reset signal.
DIU Read Interface signals | | |
sfu_diu_rreq | 1 | Out | SFU requests DRAM read. A read request must be accompanied by a valid read address.
sfu_diu_radr[21:5] | 17 | Out | Read address to DIU. 17 bits wide (256-bit aligned word).
diu_sfu_rack | 1 | In | Acknowledge from DIU that read request has been accepted and new read address can be placed on sfu_diu_radr.
diu_data[63:0] | 64 | In | Data from DIU to SoPEC Units. First 64 bits are bits 63:0 of 256 bit word. Second 64 bits are bits 127:64 of 256 bit word. Third 64 bits are bits 191:128 of 256 bit word. Fourth 64 bits are bits 255:192 of 256 bit word.
diu_sfu_rvalid | 1 | In | Signal from DIU telling SoPEC Unit that valid read data is on the diu_data bus.
DIU Write Interface signals | | |
sfu_diu_wreq | 1 | Out | SFU requests DRAM write. A write request must be accompanied by a valid write address together with valid write data and a write valid.
sfu_diu_wadr[21:5] | 17 | Out | Write address to DIU. 17 bits wide (256-bit aligned word).
diu_sfu_wack | 1 | In | Acknowledge from DIU that write request has been accepted and new write address can be placed on sfu_diu_wadr.
sfu_diu_data[63:0] | 64 | Out | Data from SFU to DIU. First 64 bits are bits 63:0 of 256 bit word. Second 64 bits are bits 127:64 of 256 bit word. Third 64 bits are bits 191:128 of 256 bit word. Fourth 64 bits are bits 255:192 of 256 bit word.
sfu_diu_wvalid | 1 | Out | Signal from PEP Unit indicating that data on sfu_diu_data is valid.
PCU Interface data and control signals | | |
pcu_addr[5:2] | 4 | In | PCU address bus. Only 4 bits are required to decode the address space for this block.
pcu_dataout[31:0] | 32 | In | Shared write data bus from the PCU.
sfu_pcu_datain[31:0] | 32 | Out | Read data bus from the SFU to the PCU.
pcu_rwn | 1 | In | Common read/not-write signal from the PCU.
pcu_sfu_sel | 1 | In | Block select from the PCU. When pcu_sfu_sel is high both pcu_addr and pcu_dataout are valid.
sfu_pcu_rdy | 1 | Out | Ready signal to the PCU. When sfu_pcu_rdy is high it indicates the last cycle of the access. For a write cycle this means pcu_dataout has been registered by the block and for a read cycle this means the data on sfu_pcu_datain is valid.
LBD Interface Data and Control Signals | | |
sfu_lbd_rdy | 1 | Out | Signal indicating that SFU has previous line data available and is ready to be written to.
lbd_sfu_advline | 1 | In | Line advance signal for both next and previous lines.
lbd_sfu_pladvword | 1 | In | Advance word signal for previous line buffer.



sfujdb_pldata[15:0] 



16 



Out 

In 



Data fronn the previous iJne buffer. 



Ibd,sfu^wdata[1 5.-0J 



16 



lbd_8fu_wdatavalid 



Write data for next line buffer. 



In 



HCU Interface Data and Control Signals 



Write data vaUd signal for next line buffer data. 



hcu_sfii_advdot 



sfu_hcu_sdata 



sfo_hcu_avall 



In 



Out 



Out 



Signal Indicating to the SFU that the HCU is ready to accept 
the next dot of data from SRJ . 

Bi-level dot data. 



Signal indicating vafid bMevel dot data on sfu_hcu_sdatsL 
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25.8.2 Configuration Registers 

Table 115. SFU Configuration Registers

Address | Register name | Bits | Reset value | Description

0x00 | Reset | 1 | 0x1 | A write to this register causes a reset of the SFU. This register can be read to indicate the reset state: 0 - reset in progress, 1 - reset not in progress.
0x04 | Go | 1 | 0x0 | Writing 1 to this register starts the SFU. Writing 0 to this register halts the SFU. When Go is deasserted the state-machines go to their idle states but all counters and configuration registers keep their values. When Go is asserted all counters are reset, but configuration registers keep their values (i.e. they don't get reset). The SFU must be started before the LBD is started. This register can be read to determine if the SFU is running (1 - running, 0 - stopped).

Setup registers (constant during the processing of the page)
0x08 | HCUNumDots | 16 | 0x0000 | Width of HCU line (in dots).
0x0C | HCUDRAMWords | 8 | 0x00 | Number of 256-bit DRAM words in a HCU line.
0x10 | LBDNumWords | 12 | 0x000 | Number of 16-bit words in an LBD line. (LBD line length must be a multiple of 16 bits.)
0x14 | StartSfuAdr[21:5] (256-bit aligned DRAM address) | 17 | 0x00000 | First SFU location in memory.
0x18 | EndSfuAdr[21:5] (256-bit aligned DRAM address) | 17 | 0x00000 | Last SFU location in memory.
0x1C | XstartCount | 8 | 0x00 | Value to be loaded at the start of every line into the counter used for scaling in the X direction. Used to control the scaling of the first dot in a line. This value will typically equal zero, except in the case where a number of dots are clipped on the lead in to a line.
0x20 | XscaleNum | 8 | 0x01 | Numerator of spot data scale factor in X direction.
0x24 | XscaleDenom | 8 | 0x01 | Denominator of spot data scale factor in X direction.
0x28 | YscaleNum | 8 | 0x01 | Numerator of spot data scale factor in Y direction.
0x2C | YscaleDenom | 8 | 0x01 | Denominator of spot data scale factor in Y direction.
0x30 | HCUReadLineAdr[21:5] (256-bit aligned DRAM address) | 17 | - | Current address pointer in DRAM to HCU read data. Read only register.
0x34 | HCUStartReadLineAdr[21:5] (256-bit aligned DRAM address) | 17 | - | Start address in DRAM of line being read by HCU buffer in DRAM. Read only register.
0x38 | LBDNextLineAdr[21:5] (256-bit aligned DRAM address) | 17 | - | Current address pointer in DRAM to LBD write data. Read only register.
0x3C | LBDPrevLineAdr[21:5] (256-bit aligned DRAM address) | 17 | - | Current address pointer in DRAM to LBD read data. Read only register.
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25.8.3 SFU sub-block partition 




[Figure 125 block diagram: the PCU Interface (pcu_addr[5:2], pcu_dataout[31:0]) holds the configuration values hcu_num_dots, hcu_dram_words, lbd_num_words, start_sfu_adr, end_sfu_adr, xstart_count and the X and Y scale factors, together with the address pointers hcu_readline_adr, hcu_startreadline_adr, lbd_nextline_adr and lbd_prevline_adr. The LBD Previous Line FIFO, LBD Next Line FIFO and HCU Read Line FIFO sit between the LBD and HCU and the DIU Interface and Address Generator Unit (DAG), which drives sfu_diu_wreq, sfu_diu_wadr[21:5], sfu_diu_data[63:0], sfu_diu_wvalid, sfu_diu_rreq and sfu_diu_radr[21:5] and receives diu_sfu_wack, diu_data[63:0], diu_sfu_rvalid and diu_sfu_rack.]

Figure 125. SFU Sub-Block Partition
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The SFU contains a number of sub-blocks: 



Sub-block | Description
PCU Interface | PCU interface, configuration and status registers. Also generates the Go and the Reset signals for the rest of the SFU.
LBD Previous Line FIFO | Contains FIFO which is read by the LBD previous line interface.
LBD Next Line FIFO | Contains FIFO which is written by the LBD next line interface.
HCU Read Line FIFO | Contains FIFO which is read by the HCU interface.
DIU Interface and Address Generator | Contains DIU read interface and DIU write interface. Manages the address pointers for the bi-level DRAM buffer. Contains X and Y scaling logic.



The three FIFO sub-blocks have no knowledge of where in DRAM their read or write data is stored. In this sense the FIFO sub-blocks are completely de-coupled from the bi-level DRAM buffer. All DRAM address management is centralised in the DIU Interface and Address Generation sub-block. DRAM access is pre-emptive, i.e. after a FIFO unit has made an access then as soon as the FIFO has space to read or data to write a DIU access will be requested immediately. This ensures there are no unnecessary stalls introduced e.g. at the end of an LBD or HCU line.

There now follows a description of the SFU sub-blocks.

25.8.4 PCU Interface Sub-block

t?L^sFu1SS^ s"S;JI"' '""'^ "^'^ ^^^^^^ ^^^^ -^^^ - 
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25.8.5 LBOPrevLineFlFO sub-block 



Table 116. LBDPrevLineFIFO Additional IO Definitions

Port name | Pins | I/O | Description

Internal Output
plf_rdy | 1 | Out | Signal indicating LBDPrevLineFIFO is ready to be read from. Until the first lbd_sfu_advline for a band has been received and after the number of lbd_sfu_pladvword strobes received for a line is equal to LBDNumWords, plf_rdy is always asserted. During the second and subsequent lines plf_rdy is deasserted whenever the LBDPrevLineFIFO is empty.

DIU and Address Generation sub-block Signals
plf_diurreq | 1 | Out | Signal indicating the LBDPrevLineFIFO has 256-bits of data free.
plf_diurack | 1 | In | Acknowledge that read request has been accepted and plf_diurreq should be de-asserted.
plf_diurdata | 64 | In | Data from the DIU to LBDPrevLineFIFO. First 64-bits are bits 63:0 of 256 bit word. Second 64-bits are bits 127:64 of 256 bit word. Third 64-bits are bits 191:128 of 256 bit word. Fourth 64-bits are bits 255:192 of 256 bit word.
plf_diurvalid | 1 | In | Signal indicating data on plf_diurdata is valid.
plf_diuidle | 1 | Out | Signal indicating DIU state-machine is in the IDLE state.



25.8.5.1 General Description

The LBDPrevLineFIFO sub-block comprises a double 256-bit buffer between the LBD and the DIU Interface and Address Generator sub-block. The FIFO is implemented as 8 times 64-bit words. The FIFO is written by the DIU Interface and Address Generator sub-block and read by the LBD.

[Figure 126 block diagram: an 8 word 64-bit FIFO written from plf_diurdata (64 bits) using write_en and write_adr and read using read_adr, with word_select choosing the 16-bit value output to the LBD. The FIFO control logic takes lbd_sfu_pladvword, lbd_sfu_advline, lbd_num_words, plf_diurack and plf_diurvalid, and drives plf_rdy and plf_diurreq.]

Figure 126. LBDPrevLineFIFO Sub-block

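Whenever 4 locations in the FIFO are free the FIFO will request 256-bits of data from the DIU Interface and Address Generation sub-block by asserting plf_diurreq. A signal plf_diurack indicates that the request has been accepted and plf_diurreq should be de-asserted.

The data is written to the FIFO as 64-bits on plf_diurdata[63:0] over 4 clock cycles. The signal plf_diurvalid indicates that the data returned on plf_diurdata[63:0] is valid. plf_diurvalid is used to generate the FIFO write enable, write_en, and to increment the FIFO write address, write_adr[2:0]. If the LBDPrevLineFIFO still has 256-bits free then plf_diurreq should be asserted again.

The DIU Interface and Address Generation sub-block handles all address pointer management and DIU interfacing and decides whether to acknowledge a request for data from the FIFO.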



[Figure 127 timing diagram: pclk, plf_diurreq, plf_diurack, plf_diurvalid and plf_diurdata[63:0].]

Figure 127. Timing of signals on the LBDPrevLineFIFO Interface to DIU and Address Generator

The state diagram of the LBDPrevLineFIFO DIU Interface is shown in Figure 128. If sfu_go is deasserted then the state-machine returns to its idle state.

[Figure 128 state diagram: reset enters Idle (diuidle = 1). When 256-bits are free in the FIFO the machine moves to Request (diurreq = 1, diuidle = 0). Once the request is accepted diurreq returns to 0 and the machine steps through Data0, Data1, Data2 and Data3, advancing on diurvalid == 1.]

Figure 128. LBDPrevLineFIFO DIU Interface State Diagram

The LBD reads 16-bit wide data from the LBDPrevLineFIFO on sfu_lbd_pldata[15:0]. lbd_sfu_pladvword from the LBD tells the LBDPrevLineFIFO to supply the next 16-bit word. The FIFO control logic generates a signal word_select which selects the next 16-bits of the 64-bit FIFO word to output on sfu_lbd_pldata[15:0]. When the entire current 64-bit FIFO word has been read by the LBD, lbd_sfu_pladvword will cause the next word to be popped from the FIFO.

Previous line data is not supplied until after the first lbd_sfu_advline strobe from the LBD after sfu_go is asserted (zero data is supplied instead).
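
The word selection described above can be sketched in Python. This is a minimal behavioural model, not the hardware implementation: the function names are hypothetical, and the ordering (word_select = 0 picks bits 15:0 of the 64-bit FIFO word) is an assumption that the text does not state.

```python
def select_word16(fifo_word64: int, word_select: int) -> int:
    """Return the 16-bit slice of a 64-bit FIFO word chosen by word_select.

    Assumption: word_select 0 selects bits 15:0, 1 selects bits 31:16, etc.
    """
    if not 0 <= word_select <= 3:
        raise ValueError("word_select must be in 0..3")
    return (fifo_word64 >> (16 * word_select)) & 0xFFFF


def read_words(fifo_words, n_strobes):
    """Model n_strobes lbd_sfu_pladvword strobes over a list of 64-bit words.

    Returns the 16-bit values that would appear on sfu_lbd_pldata[15:0].
    After four 16-bit reads the current 64-bit word is popped.
    """
    out = []
    word_select = 0   # which 16-bit slice of the current word
    index = 0         # which 64-bit FIFO word is at the head
    for _ in range(n_strobes):
        out.append(select_word16(fifo_words[index], word_select))
        word_select += 1
        if word_select == 4:      # whole 64-bit word consumed
            word_select = 0
            index += 1            # pop the next word from the FIFO
    return out
```

For example, `read_words([0x4444333322221111, 0x8888777766665555], 5)` yields the four slices of the first word followed by the low slice of the second.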

The LBDPrevLineFIFO control logic uses a counter, pladvword_count[11:0], to count the number of lbd_sfu_pladvword strobes received from the LBD. The pladvword_count counter is reset to 0 when the number of lbd_sfu_pladvword strobes received is equal to LBDNumWords.

The LBDPrevLineFIFO generates a signal plf_rdy to indicate that it has data available. Until the first lbd_sfu_advline for a band has been received and after the number of lbd_sfu_pladvword strobes received for a line is equal to LBDNumWords, plf_rdy is always asserted. During the second and subsequent lines plf_rdy is deasserted whenever the LBDPrevLineFIFO is empty.

The last 256-bit word for a line read from DRAM can contain extra padding which should not be output to the LBD. This is because the number of 16-bit words per line may not fit exactly into a 256-bit DRAM word. When the count of the number of lbd_sfu_pladvword strobes received for a line is equal to lbd_num_words the LBDPrevLineFIFO must adjust the FIFO read address to point to the next 256-bit word boundary in the FIFO. The next 256-bit aligned address is calculated by inverting the MSB of read_adr[2:0] and setting the other bits to 0:

    if (pladvword_count == lbd_num_words) then
        read_adr[1:0] = b00
        read_adr[2] = ~read_adr[2]
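
The read address adjustment in the pseudocode above can be checked with a short Python sketch. The function name is hypothetical; the only behaviour modelled is the one stated in the pseudocode (clear bits 1:0, invert bit 2 of a 3-bit address).

```python
def next_256bit_boundary(read_adr: int) -> int:
    """Return the FIFO read address of the next 256-bit boundary.

    read_adr is the 3-bit address of one of 8 x 64-bit FIFO locations.
    Bits 1:0 are cleared and bit 2 is inverted, which moves the pointer
    to the start of the other 256-bit half of the double buffer.
    """
    if not 0 <= read_adr <= 7:
        raise ValueError("read_adr must be a 3-bit value")
    msb = (read_adr >> 2) & 1
    return (msb ^ 1) << 2
```

Any address in the lower half (0 to 3) maps to 4, and any address in the upper half (4 to 7) maps to 0, so padding left in the current 256-bit word is skipped.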



25.8.6 LBDNextLineFIFO sub-block

Port name | Pins | I/O | Description

LBDNextLineFIFO Interface Signals
nlf_rdy | 1 | Out | Signal indicating LBDNextLineFIFO is ready to be written to, i.e. there is space in the FIFO.

DIU and Address Generation sub-block Signals
nlf_diuwreq | 1 | Out | Signal indicating the LBDNextLineFIFO has 256-bits of data for writing to the DIU.
nlf_diuwack | 1 | In | Acknowledge from DIU that write request has been accepted and write data can be output on nlf_diuwdata together with nlf_diuwvalid.
nlf_diuwdata | 64 | Out | Data from LBDNextLineFIFO to DIU Interface. First 64-bits are bits 63:0 of 256 bit word. Second 64-bits are bits 127:64 of 256 bit word. Third 64-bits are bits 191:128 of 256 bit word. Fourth 64-bits are bits 255:192 of 256 bit word.
nlf_diuwvalid | 1 | Out | Signal indicating that data on nlf_diuwdata is valid.



25.8.6.1 General Description

The LBDNextLineFIFO sub-block comprises a double 256-bit buffer between the LBD and the DIU Interface and Address Generator sub-block. The FIFO is implemented as 8 times 64-bit words. The FIFO is written by the LBD and read by the DIU Interface and Address Generator.

[Figure 129 block diagram: 16-bit lbd_sfu_wdata is collected in a 64-bit register (sfu_wdata_reg) and written to an 8 word 64-bit FIFO using write_en and write_adr; the FIFO is read using read_adr onto nlf_diuwdata (64 bits). The FIFO control logic takes lbd_sfu_wvalid and nlf_diuwack and drives nlf_rdy, nlf_diuwreq and nlf_diuwvalid.]

Figure 129. LBDNextLineFIFO Sub-block

Whenever 4 locations in the FIFO are full the FIFO will request 256-bits of data to be written to the DIU Interface and Address Generator by asserting nlf_diuwreq. A signal nlf_diuwack indicates that the request has been accepted and nlf_diuwreq should be de-asserted. On receipt of nlf_diuwack the data is sent to the DIU Interface as 64-bits on nlf_diuwdata[63:0] over 4 clock cycles. The signal nlf_diuwvalid indicates that the data on nlf_diuwdata[63:0] is valid. nlf_diuwvalid should be asserted with the smallest latency after nlf_diuwack. If the LBDNextLineFIFO still has 256-bits more to transfer then nlf_diuwreq should be asserted again.




Figure 130. Timing of signals on LBDNextLineFIFO Interface to DIU and Address Generator

The state diagram of the LBDNextLineFIFO DIU Interface is shown in Figure 131.

[Figure 131 state diagram: reset enters Idle. When 256-bits are in the FIFO the machine moves to Request (diuwreq = 1), then Ack (diuwreq = 0), then steps through Data0, Data1, Data2 and Data3 with diuwvalid = 1.]

Figure 131. LBDNextLineFIFO DIU Interface State Diagram

The signal nlf_rdy indicates that the LBDNextLineFIFO has space for writing by the LBD. The LBD writes 16-bit wide data supplied on lbd_sfu_wdata[15:0]. lbd_sfu_wvalid indicates that the data is valid.

The data is collected to make up a 64-bit word before being written to the FIFO.

The LBDNextLineFIFO control logic counts the number of lbd_sfu_wvalid signals. The lbd_sfu_wvalid



25.8.7 sfu_lbd_rdy Generation

The signal sfu_lbd_rdy is generated by ANDing plf_rdy from the LBDPrevLineFIFO and nlf_rdy from the LBDNextLineFIFO.

sfu_lbd_rdy indicates to the LBD that the SFU is available for writing, i.e. there is space available in the LBDNextLineFIFO. After the first lbd_sfu_advline and before the number of lbd_sfu_pladvword strobes received is equivalent to the line length, sfu_lbd_rdy indicates that the SFU is available for both reading, i.e. there is data in the LBDPrevLineFIFO, and writing. Thereafter it indicates the SFU is available for writing.
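
The ready generation can be illustrated with a small Python sketch. It is a behavioural model under stated assumptions: the function and argument names (seen_first_advline, fifo_empty and so on) are hypothetical, and per-cycle timing is not modelled.

```python
def plf_rdy(seen_first_advline: bool, pladvword_count: int,
            lbd_num_words: int, fifo_empty: bool) -> bool:
    """Model of plf_rdy as described for the LBDPrevLineFIFO.

    Always asserted before the first lbd_sfu_advline of a band and once
    all words of the line have been read; otherwise it follows FIFO state.
    """
    if not seen_first_advline or pladvword_count == lbd_num_words:
        return True
    return not fifo_empty


def sfu_lbd_rdy(plf_ready: bool, nlf_ready: bool) -> bool:
    """sfu_lbd_rdy is the AND of plf_rdy and nlf_rdy."""
    return bool(plf_ready and nlf_ready)
```

On the first line of a band plf_rdy is always true, so sfu_lbd_rdy reduces to nlf_rdy, which matches the statement that it then indicates availability for writing only.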
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25.8.8 



LBD-SFU Interfaces Timing Waveform Description

u« d«. „=.d »o„ sfJZ^V^^^ ° 

The main points to note from Figure 132 are- 

spaces in the SFU FIFO) ^ ^ ^® rcmainrng 

.^.^r""' ""^"•^ "^"^ for sfuJbd_„iy;ot 2seSeda^:if ' 

o^^^^y^^"^^,^^^^ ]'^'> '^'^J on clock cycle 8 it starts 

SFU in clock cycfe "^^fi'-^d^tavahd and putting new data out which is registered by the 

2uo irbe^aSS.^"" '^'-'^^^ -•^ch should be highlighted. On exanunation this turns 
Scenario I : 

Scenario 2: 

sju Jbd_r^ will go low when there is stUl 1 piece of data in the FIFO If there n« lhrt.fi , ^ ' ^ 

sjujbd_pldcua[15:0j. V^-^ao^ wiu assert again, and so the data wUl appear on 

Scenario 3: 
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Figure 132. Signal waveforms between LBD and SFU 
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25.8.9 HCUReadLineFIFO sub-block 

Table 118. HCUReadLineFIFO Additional IO Definition

Port name | Pins | I/O | Description

DIU and Address Generation sub-block Signals
hrf_xadvance | 1 | In | Signal from horizontal scaling unit. 1 - supply the next dot. 0 - supply the current dot.
hrf_hcuendofline | 1 | Out | Signal lasting 1 cycle indicating the end of the HCU read line.
hrf_diurreq | 1 | Out | Signal indicating the HCUReadLineFIFO has space for 256-bits of DIU data.
hrf_diurack | 1 | In | Acknowledge that read request has been accepted and hrf_diurreq should be de-asserted.
hrf_diurdata | 64 | In | Data from DIU to HCUReadLineFIFO. First 64-bits are bits 63:0 of 256 bit word. Second 64-bits are bits 127:64 of 256 bit word. Third 64-bits are bits 191:128 of 256 bit word. Fourth 64-bits are bits 255:192 of 256 bit word.
hrf_diurvalid | 1 | In | Signal indicating data on hrf_diurdata is valid.
hrf_diuidle | 1 | Out | Signal indicating DIU state-machine is in the IDLE state.



25.8.9.1 General Description 

The HCUReadLineFIFO sub-block comprises a double 256-bit buffer between the HCU and the DIU Interface and Address Generator sub-block. The FIFO is implemented as 8 times 64-bit words. The FIFO is written by the DIU Interface and Address Generator sub-block and read by the HCU.

[Figure 133 block diagram: an 8 word 64-bit FIFO written from hrf_diurdata (64 bits) using write_en and write_adr and read using read_adr, with bit_select choosing the single bit output on sfu_hcu_sdata. The FIFO control logic takes hcu_sfu_advdot, hcu_num_dots (16 bits), hrf_xadvance, hrf_diurack and hrf_diurvalid, and drives sfu_hcu_avail, hrf_hcuendofline and hrf_diurreq.]

Figure 133. HCUReadLineFIFO Sub-block
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The DIU Interface and Address Generation (DAG) sub-block interface of the HCUReadLineFIFO is iden- 
tical to the LBDPrevLineFIFO DIU interface. 

Whenever 4 locations in the FIFO are free the FIFO will request 256-bits of data from the DAG sub-block by asserting hrf_diurreq. A signal hrf_diurack indicates that the request has been accepted and hrf_diurreq should be de-asserted.

The data is written to the FIFO as 64-bits on hrf_diurdata[63:0] over 4 clock cycles. The signal hrf_diurvalid indicates that the data returned on hrf_diurdata[63:0] is valid. hrf_diurvalid is used to generate the FIFO write enable, write_en, and to increment the FIFO write address, write_adr[2:0]. If the HCUReadLineFIFO still has 256-bits free then hrf_diurreq should be asserted again.

The HCUReadLineFIFO generates a signal sfu_hcu_avail to indicate that it has data available for the HCU. The HCU reads single-bit data supplied on sfu_hcu_sdata. The FIFO control logic generates a signal bit_select which selects the next bit of the 64-bit FIFO word to output on sfu_hcu_sdata. The signal hcu_sfu_advdot tells the HCUReadLineFIFO to supply the next dot (hrf_xadvance = 1) or the current dot (hrf_xadvance = 0) on sfu_hcu_sdata according to the hrf_xadvance signal from the scaling control unit in the DAG sub-block. The HCU should not generate the hcu_sfu_advdot signal until sfu_hcu_avail is true. The HCU can therefore stall waiting for the sfu_hcu_avail signal.

When the entire current 64-bit FIFO word has been read by the HCU, hcu_sfu_advdot will cause the next word to be popped from the FIFO.

The last 256-bit word for a line read from DRAM and written into the HCUReadLineFIFO can contain dots or extra padding which should not be output to the HCU. A counter in the HCUReadLineFIFO, hcuadvdot_count[15:0], counts the number of hcu_sfu_advdot strobes received from the HCU. When the count equals hcu_num_dots[15:0] the HCUReadLineFIFO must adjust the FIFO read address to point to the next 256-bit word boundary in the FIFO. The FIFO read address, read_adr[2:0], requires 3 bits to address 8 locations of 64-bits, so the next 256-bit aligned address is calculated by inverting the MSB of read_adr and setting all other bits to 0.

    if (hcuadvdot_count == hcu_num_dots) then
        read_adr[1:0] = b00
        read_adr[2] = ~read_adr[2]

The DIU Interface and Address Generator sub-block scaling unit also needs to know when hcuadvdot_count equals hcu_num_dots. This condition is exported from the HCUReadLineFIFO as the signal hrf_hcuendofline. When hrf_hcuendofline is asserted the scaling unit will decide, based on vertical scaling, whether to go back to the start of the current line or go on to the next line.



25.8.9.2 DRAM Access Limitation

The SFU must output 1 bit/cycle to the HCU. Since HCUNumDots may not be a multiple of 256 bits the last 256-bit DRAM word on the line can contain extra zeros. In this case, the SFU may not be able to provide 1 bit/cycle to the HCU. This could lead to a stall by the SFU. This stall could then propagate if the margins being used by the HCU are not sufficient to hide it. The maximum stall can be estimated by the calculation: DRAM service period - X scale factor * dots used from last DRAM read for HCU line.
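
As a rough worked example of the stall estimate, the following Python sketch evaluates the expression for hypothetical numbers. The function name, the clamping of negative results to zero, and the sample values (a DRAM service period of 256 cycles) are assumptions made for illustration and do not come from the specification.

```python
def max_stall_cycles(dram_service_period: int, x_scale_factor: float,
                     dots_used_in_last_word: int) -> float:
    """Estimate the maximum SFU stall, in cycles, at the end of an HCU line.

    Implements: DRAM service period - X scale factor * dots used from the
    last DRAM read for the HCU line. A negative result means the dots left
    in the last word cover the service period, so it is clamped to 0
    (the clamp is an assumption, not part of the source formula).
    """
    stall = dram_service_period - x_scale_factor * dots_used_in_last_word
    return max(stall, 0)
```

With a service period of 256 cycles, no scaling and only 40 useful dots in the last word, the estimate is 216 cycles; with a scale factor of 2 and 200 useful dots the last word outlasts the service period and the estimate is 0.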
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25.8.10 Dili Interface and Address Generator Sub-block 



Table 119. DIU Interface and Address Generator Additional IO Description

Port name | Pins | I/O | Description

Internal LBDPrevLineFIFO Inputs
plf_diurreq | 1 | In | Signal indicating the LBDPrevLineFIFO has 256-bits of data free.
plf_diurack | 1 | Out | Acknowledge that read request has been accepted and plf_diurreq should be de-asserted.
plf_diurdata | 64 | Out | Data from the DIU to LBDPrevLineFIFO. First 64-bits are bits 63:0 of 256 bit word. Second 64-bits are bits 127:64 of 256 bit word. Third 64-bits are bits 191:128 of 256 bit word. Fourth 64-bits are bits 255:192 of 256 bit word.
plf_diurvalid | 1 | Out | Signal indicating data on plf_diurdata is valid.
plf_diuidle | 1 | In | Signal indicating DIU state-machine is in the IDLE state.

Internal LBDNextLineFIFO Inputs
nlf_diuwreq | 1 | In | Signal indicating the LBDNextLineFIFO has 256-bits of data for writing to the DIU.
nlf_diuwack | 1 | Out | Acknowledge from DIU that write request has been accepted and write data can be output on nlf_diuwdata together with nlf_diuwvalid.
nlf_diuwdata | 64 | In | Data from LBDNextLineFIFO to DIU Interface. First 64-bits are bits 63:0 of 256 bit word. Second 64-bits are bits 127:64 of 256 bit word. Third 64-bits are bits 191:128 of 256 bit word. Fourth 64-bits are bits 255:192 of 256 bit word.
nlf_diuwvalid | 1 | In | Signal indicating that data on nlf_diuwdata is valid.

Internal HCUReadLineFIFO Inputs
hrf_hcuendofline | 1 | In | Signal lasting 1 cycle indicating the end of the HCU read line.
hrf_xadvance | 1 | Out | Signal from horizontal scaling unit. 1 - supply the next dot. 0 - supply the current dot.
hrf_diurreq | 1 | In | Signal indicating the HCUReadLineFIFO has space for 256-bits of DIU data.
hrf_diurack | 1 | Out | Acknowledge that read request has been accepted and hrf_diurreq should be de-asserted.
hrf_diurdata | 64 | Out | Data from DIU to HCUReadLineFIFO. First 64-bits are bits 63:0 of 256 bit word. Second 64-bits are bits 127:64 of 256 bit word. Third 64-bits are bits 191:128 of 256 bit word. Fourth 64-bits are bits 255:192 of 256 bit word.
hrf_diurvalid | 1 | Out | Signal indicating data on hrf_diurdata is valid.
hrf_diuidle | 1 | In | Signal indicating DIU state-machine is in the IDLE state.
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25.8.10.1 General Description 

The DIU Interface and Address Generator (DAG) sub-block manages the bi-level buffer in DRAM. It has a DIU write interface for the LBDNextLineFIFO and a DIU read interface shared between the HCUReadLineFIFO and LBDPrevLineFIFO.

All DRAM address management is centralised in the DAG. DRAM access is pre-emptive, i.e. after a FIFO unit has made an access then as soon as the FIFO has space to read or data to write a DIU access will be requested immediately. This ensures there are no unnecessary stalls introduced e.g. at the end of an LBD or HCU line.

The control logic for horizontal and vertical non-integer scaling is completely contained in the DAG sub-block. The scaling control unit exports the hrf_xadvance signal to the HCUReadLineFIFO which indicates whether to replicate the current dot or supply the next dot for horizontal scaling.

25.8.10.2 DIU Write Interface 

The LBDNextLineFIFO generates all the DIU write interface signals directly except for sfu_diu_wadr[21:5], which is generated by the Address Generation logic.

The DIU request from the LBDNextLineFIFO will be negated if its respective address pointer in DRAM is invalid, i.e. nlf_adrvalid = 0. The implementation must ensure that no erroneous requests occur on sfu_diu_wreq.



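[Figure 134 block diagram: nlf_diuwreq is ANDed with nlf_adrvalid to form write_req, which the DIU Write Interface passes on as sfu_diu_wreq. diu_sfu_wack is returned as nlf_diuwack, and nlf_diuwdata (64 bits) and nlf_diuwvalid are passed through as sfu_diu_data[63:0] and sfu_diu_wvalid.]

Figure 134. DIU Write Interface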



25.8.10.3 DIU Read Interface

Both HCUReadLineFIFO and LBDPrevLineFIFO share the read interface. If both sources request simultaneously then the arbitration logic implements a round-robin sharing of read accesses between the HCUReadLineFIFO and LBDPrevLineFIFO.

The DIU read request arbitration logic generates a signal, select_hrfplf, which indicates whether the DIU access is from the HCUReadLineFIFO or LBDPrevLineFIFO (0 - HCUReadLineFIFO, 1 - LBDPrevLineFIFO). Figure 135 shows select_hrfplf multiplexing the returned DIU acknowledge and read data to either the HCUReadLineFIFO or LBDPrevLineFIFO.

[Figure 135 block diagram: select_hrfplf selects between hrf_diuidle and plf_diuidle, and steers diu_sfu_rack to hrf_diurack or plf_diurack, diu_data[63:0] to hrf_diurdata or plf_diurdata (64 bits each), and diu_sfu_rvalid to hrf_diurvalid or plf_diurvalid.]

Figure 135. DIU Read Interface multiplexing by select_hrfplf

The DIU read request arbitration logic is shown in Figure 136. The arbitration logic will select a DIU read request on hrf_diurreq or plf_diurreq and assert sfu_diu_rreq which goes to the DIU. The accompanying DIU read address is generated by the Address Generation Logic. The select signal select_hrfplf is set according to the arbitration winner (0 - HCUReadLineFIFO, 1 - LBDPrevLineFIFO). sfu_diu_rreq is cleared when the DIU acknowledges the request on diu_sfu_rack. Arbitration cannot take place again until the DIU state-machine of the arbitration winner is in the idle state, indicated by diu_idle. This is necessary to ensure that the DIU read data is multiplexed back to the FIFO that requested it.



[Figure 136 block diagram: hrf_diurreq and plf_diurreq are each ANDed with their address valid flag before entering the Read Request Arbitration Logic, which also takes diu_sfu_rack and diu_idle, keeps a 2-bit history and a busy flag, and outputs select_hrfplf and sfu_diu_rreq.]

Figure 136. DIU read request arbitration logic

The DIU read requests from the HCUReadLineFIFO and LBDPrevLineFIFO will be negated if their respective addresses in DRAM are invalid. The implementation must ensure that no erroneous requests occur on sfu_diu_rreq.

If the HCUReadLineFIFO and LBDPrevLineFIFO request simultaneously, then if the request is not immediately following another DIU read port access, the arbitration logic should choose the HCUReadLineFIFO by default. If there are back to back requests to the DIU read port then the arbitration logic implements a round-robin sharing of read accesses between the HCUReadLineFIFO and LBDPrevLineFIFO.

A pseudo-code description of the DIU read arbitration is given below. 

    // hrf is HCUReadLineFIFO, plf is LBDPrevLineFIFO
    // initialisation on reset
    select_hrfplf = 0    // default choose hrf
    history = none       // no DIU read access immediately preceding

    // state-machine is busy between asserting sfu_diu_rreq and diu_idle = 1
    // if DIU read winner's state-machine is idle then de-assert busy
    if (diu_idle == 1) then
        busy = 0

    // if DIU acknowledge received from DIU then de-assert DIU request
    if (diu_sfu_rack == 1) then
        // de-assert request in response to acknowledge
        sfu_diu_rreq = 0

    // if not busy then arbitrate between incoming requests
    // if request detected then assert busy
    if (busy == 0) then
        // if there is no request
        if (hrf_diurreq == 0) AND (plf_diurreq == 0) then
            sfu_diu_rreq = 0
            history = none
        // else there is a request
        else {
            // assert busy and request DIU read access
            busy = 1
            sfu_diu_rreq = 1
            // arbitrate in round-robin fashion between the requestors
            // if only HCUReadLineFIFO requesting choose HCUReadLineFIFO
            if (hrf_diurreq == 1) AND (plf_diurreq == 0) then
                history = hrf
                select_hrfplf = 0
            // if only LBDPrevLineFIFO requesting choose LBDPrevLineFIFO
            if (hrf_diurreq == 0) AND (plf_diurreq == 1) then
                history = plf
                select_hrfplf = 1
            // if both HCUReadLineFIFO and LBDPrevLineFIFO requesting
            if (hrf_diurreq == 1) AND (plf_diurreq == 1) then
                // no immediately preceding request choose HCUReadLineFIFO
                if (history == none) then
                    history = hrf
                    select_hrfplf = 0
                // if previous winner was HCUReadLineFIFO choose LBDPrevLineFIFO
                elsif (history == hrf) then
                    history = plf
                    select_hrfplf = 1
                // if previous winner was LBDPrevLineFIFO choose HCUReadLineFIFO
                elsif (history == plf) then
                    history = hrf
                    select_hrfplf = 0
        }   // end there is a request
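
The arbitration pseudocode can be exercised with a small Python model. This is a sketch under assumptions: the class name ReadArbiter and the step() interface are hypothetical, each call to step() stands for one evaluation of the pseudocode, and the gating of requests by the address valid flags is left out.

```python
class ReadArbiter:
    """Behavioural model of the DIU read request round-robin arbitration."""

    def __init__(self):
        self.select_hrfplf = 0   # 0 = HCUReadLineFIFO, 1 = LBDPrevLineFIFO
        self.history = None      # previous winner: None, "hrf" or "plf"
        self.busy = False
        self.sfu_diu_rreq = 0

    def step(self, hrf_diurreq, plf_diurreq, diu_sfu_rack=0, diu_idle=0):
        """Evaluate the arbiter once and return select_hrfplf."""
        if diu_idle:                 # winner's state-machine is idle again
            self.busy = False
        if diu_sfu_rack:             # DIU accepted the request
            self.sfu_diu_rreq = 0
        if not self.busy:
            if not hrf_diurreq and not plf_diurreq:
                self.sfu_diu_rreq = 0
                self.history = None
            else:
                self.busy = True
                self.sfu_diu_rreq = 1
                if hrf_diurreq and not plf_diurreq:
                    winner = "hrf"
                elif plf_diurreq and not hrf_diurreq:
                    winner = "plf"
                elif self.history == "hrf":
                    winner = "plf"   # round-robin: alternate after hrf
                else:
                    winner = "hrf"   # default, or alternate after plf
                self.history = winner
                self.select_hrfplf = 0 if winner == "hrf" else 1
        return self.select_hrfplf
```

With both FIFOs requesting continuously, successive arbitration rounds alternate between 0 and 1, starting with the HCUReadLineFIFO, and no new winner is chosen while busy is set.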

25.8.10.4 Address Generation Logic

The DIU interface generates the DRAM addresses of data read and written by the SFU's FIFOs.

A write request from the LBDNextLineFIFO on nlf_diuwreq causes a write request from the DIU Write Interface. The Address Generator supplies the DRAM write address on sfu_diu_wadr[21:5].

A read request from the LBDPrevLineFIFO on plf_diurreq or the HCUReadLineFIFO on hrf_diurreq causes a read request from the DIU Read Interface. The Address Generator supplies the DRAM read address on sfu_diu_radr[21:5].



[Figure 137 block diagram: the address generation logic takes sfu_go, start_sfu_adr and end_sfu_adr, supplies sfu_diu_radr[21:5] and sfu_diu_wadr[21:5], and produces the address valid flags.]

Figure 137. Address Generation

The address generator is configured with the number of DRAM words to read in a HCU line

Address Generation 

There are four address pointers used to manage the bi-level DRAM buffer:

a. hcu_readline_adr[21:5] is the read address in DRAM for the HCUReadLineFIFO.

b. hcu_startreadline_adr[21:5] is the start address in DRAM of the line currently being read by the HCUReadLineFIFO.

c. lbd_nextline_adr[21:5] is the write address in DRAM for the LBDNextLineFIFO.

d. lbd_prevline_adr[21:5] is the read address in DRAM for the LBDPrevLineFIFO.

The current values of these address pointers are readable by the CPU.

Four corresponding address valid flags are required to indicate whether the address pointers are valid:

a. hlf_adrvalid,

b. hlf_start_adrvalid,

c. nlf_adrvalid,

d. plf_adrvalid.

DRAM requests from the FIFOs will not be issued to the DIU until the appropriate address flag is valid.

Once a request has been acknowledged, the address generation logic can calculate the address of the next 256-bit word in DRAM, ready for the next request.

Rules for address pointers 

The address pointers must obey certain rules which indicate whether they are valid: 

a. hcu_readline_adr[21:5] is only valid if it is reading earlier in the line than
lbd_nextline_adr[21:5] is writing, i.e. hlf_adrvalid = hcu_readline_adr[21:5] /=
lbd_nextline_adr[21:5].

b. The SFU cannot overwrite the current line that the HCU is reading from, i.e. hlf_start_adrvalid =
lbd_nextline_adr[21:5] /= hcu_startreadline_adr[21:5].

c. The LBDNextLineFIFO must be writing earlier in the line than LBDPrevLineFIFO is reading
and must not overwrite the current line that the HCU is reading from, i.e. nlf_adrvalid =
lbd_nextline_adr[21:5] /= lbd_prevline_adr[21:5] AND hcu_startreadline_valid.

d. The LBDPrevLineFIFO can read right up to the address that LBDNextLineFIFO is writing, i.e.
plf_adrvalid = lbd_prevline_adr[21:5] /= lbd_nextline_adr[21:5].

e. At startup, i.e. when sfu_go is asserted, the pointers are reset to start_sfu_adr[21:5]. The first
LBDNextLineFIFO data is allowed to be written to lbd_nextline_adr[21:5] even though
nlf_adrvalid is initially invalid.

f. The address pointers can wrap around the SFU bi-level store area in DRAM.
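The pointer rules above can be sketched as a small Python model. This is an illustrative sketch only, not the SFU implementation: the function name and the dictionary return value are assumptions, the start-of-line validity term in the LBDNextLineFIFO rule is assumed to be the same signal as hlf_start_adrvalid, the comparison in the LBDPrevLineFIFO rule is assumed to be an inequality, and the startup exemption for the first write is not modelled.

```python
def address_valid_flags(hcu_readline_adr, hcu_startreadline_adr,
                        lbd_nextline_adr, lbd_prevline_adr):
    """Return the four address valid flags for a given set of pointers.

    Hypothetical helper: each flag is a pointer inequality, mirroring the
    rules for address pointers. The startup case is not handled here.
    """
    # The HCU may read only while it is not at the address being written.
    hlf_adrvalid = hcu_readline_adr != lbd_nextline_adr
    # The SFU must not overwrite the start of the line the HCU is reading.
    hlf_start_adrvalid = lbd_nextline_adr != hcu_startreadline_adr
    # Assumption: the next-line write is also gated by hlf_start_adrvalid.
    nlf_adrvalid = (lbd_nextline_adr != lbd_prevline_adr) and hlf_start_adrvalid
    # Assumption: the previous-line read is valid while the pointers differ.
    plf_adrvalid = lbd_prevline_adr != lbd_nextline_adr
    return {
        "hlf_adrvalid": hlf_adrvalid,
        "hlf_start_adrvalid": hlf_start_adrvalid,
        "nlf_adrvalid": nlf_adrvalid,
        "plf_adrvalid": plf_adrvalid,
    }
```

With all four pointers equal (as they are immediately after reset) every flag is false, which is why the first LBDNextLineFIFO write needs a separate exemption.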
X scaling of data for HCUReadLineFIFO 

The signal hcu_sfu_advdot tells the HCUReadLineFIFO to supply the next dot or the current dot on
sfu_hcu_sdata according to the hrf_xadvance signal from the scaling control unit. When hrf_xadvance is 1
the HCUReadLineFIFO should supply the next dot. When hrf_xadvance is 0 the HCUReadLineFIFO
should supply the current dot.



Figure 138. X scaling control unit

The algorithm for non-integer scaling is described in the pseudocode below. Note, x_scale_count should
be loaded with xstart_count at the start of each line. The end of the line is indicated by
hrf_hcu_endofline from the HCUReadLineFIFO.

if (hcu_sfu_advdot == 1) then
    if (x_scale_count + x_scale_denom - x_scale_num >= 0) then
        x_scale_count = x_scale_count + x_scale_denom - x_scale_num
        hrf_xadvance = 1
    else
        x_scale_count = x_scale_count + x_scale_denom
        hrf_xadvance = 0
else
    x_scale_count = x_scale_count
    hrf_xadvance = 0
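The X scaling pseudocode can be exercised with a short Python model to see which requests advance to the next source dot. This is a sketch under stated assumptions: the function names are hypothetical, every cycle is treated as a dot request (hcu_sfu_advdot = 1), and the starting count of 0 is chosen for illustration rather than taken from the specification.

```python
def x_scale_step(x_scale_count, x_scale_num, x_scale_denom, advdot=True):
    """One evaluation of the X scaling pseudocode.

    Returns (new_x_scale_count, hrf_xadvance).
    """
    if not advdot:
        # No dot requested: hold the count and do not advance.
        return x_scale_count, 0
    if x_scale_count + x_scale_denom - x_scale_num >= 0:
        return x_scale_count + x_scale_denom - x_scale_num, 1
    return x_scale_count + x_scale_denom, 0


def advance_pattern(x_scale_num, x_scale_denom, n_requests, start_count=0):
    """Hypothetical driver: list of hrf_xadvance values for n dot requests."""
    count = start_count
    pattern = []
    for _ in range(n_requests):
        count, advance = x_scale_step(count, x_scale_num, x_scale_denom)
        pattern.append(advance)
    return pattern


# Scale factor 3/1: each source dot is supplied three times.
print(advance_pattern(3, 1, 6))   # [0, 0, 1, 0, 0, 1]
# Scale factor 3/2: two source dots are consumed per three output dots.
print(advance_pattern(3, 2, 6))   # [0, 1, 1, 0, 1, 1]
```

The count behaves like a Bresenham-style accumulator: the ratio of advances to requests converges on x_scale_denom / x_scale_num.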



Y scaling of data for HCUReadLineFIFO 

if (hrf_hcu_endofline == 1) then
    if (y_scale_count + y_scale_denom - y_scale_num >= 0) then
        y_scale_count = y_scale_count + y_scale_denom - y_scale_num
        hrf_yadvance = 1
    else
        y_scale_count = y_scale_count + y_scale_denom
        hrf_yadvance = 0
else
    y_scale_count = y_scale_count
    hrf_yadvance = 0





Figure 139. Y scaling control unit

At the end of each HCU line the address generator must either go back to the start of the
current line, by setting hrf_yadvance = 0, or go onto the next line, by setting hrf_yadvance = 1.

//if end of HCU line and advance to next line
if (hrf_hcu_endofline == 1) and (hrf_yadvance == 1) then
    //advance to start of next HCU line in DRAM
    hcu_startreadline_adr = hcu_startreadline_adr + hcu_dram_words
    //allow for address wraparound
    offset = hcu_startreadline_adr - end_sfu_adr
    if (offset >= 0) then
        hcu_startreadline_adr = start_sfu_adr + offset
    hcu_readline_adr = hcu_startreadline_adr
elsif (hrf_hcu_endofline == 1) and (hrf_yadvance == 0) then
    hcu_readline_adr = hcu_startreadline_adr
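The end-of-line address update can be sketched in Python to show how the wraparound works. The function name and the sample addresses are assumptions made for illustration; the arithmetic follows the pseudocode as written, including the offset >= 0 test.

```python
def end_of_hcu_line(hcu_startreadline_adr, hcu_dram_words,
                    start_sfu_adr, end_sfu_adr, hrf_yadvance):
    """Return (hcu_startreadline_adr, hcu_readline_adr) after end of line.

    Hypothetical helper modelling the end-of-line branch only.
    """
    if hrf_yadvance == 1:
        # Advance to the start of the next HCU line in DRAM.
        hcu_startreadline_adr += hcu_dram_words
        # Allow for address wraparound within the SFU address space.
        offset = hcu_startreadline_adr - end_sfu_adr
        if offset >= 0:
            hcu_startreadline_adr = start_sfu_adr + offset
    # In both cases reading resumes from the (possibly updated) line start.
    return hcu_startreadline_adr, hcu_startreadline_adr


# Illustrative SFU region 100..120 with 8 DRAM words per HCU line.
print(end_of_hcu_line(100, 8, 100, 120, 1))  # (108, 108): no wrap
print(end_of_hcu_line(116, 8, 100, 120, 1))  # (104, 104): wrapped
print(end_of_hcu_line(116, 8, 100, 120, 0))  # (116, 116): repeat line
```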

Figure 140 shows an overview of X and Y scaling for HCU data. 



Figure 140. Overview of X and Y scaling at HCU interface

Address generator pseudo-code:



Initialization: 

if (sfu_go rising edge) then
    //set flag to allow first write
    init = 1
    //initialise address pointers to start of SFU address space
    lbd_prevline_adr[21:5] = start_sfu_adr[21:5]
    lbd_nextline_adr[21:5] = start_sfu_adr[21:5]
    hcu_readline_adr[21:5] = start_sfu_adr[21:5]
    hcu_startreadline_adr[21:5] = start_sfu_adr[21:5]
//if first write complete
elsif (plf_adrvalid == 1) then
    //reset flag allowing first write
    init = 0



Address valid signals: 

hlf_adrvalid = hcu_readline_adr /= lbd_nextline_adr
hlf_start_adrvalid = lbd_nextline_adr /= hcu_startreadline_adr

Address pointer updating: 

// LBDNextLineFIFO
//if DIU write acknowledge and LBDNextLineFIFO address is valid
if (diu_sfu_wack == 1 AND nlf_adrvalid) then
    //if end of SFU address range
    if (lbd_nextline_adr == end_sfu_adr) then
        //go to start of SFU address range
        lbd_nextline_adr = start_sfu_adr
    else
        //increment address pointer
        lbd_nextline_adr = lbd_nextline_adr + 1

// LBDPrevLineFIFO
//if end of SFU address range
if (lbd_prevline_adr == end_sfu_adr) then
    lbd_prevline_adr = start_sfu_adr
else
    lbd_prevline_adr = lbd_prevline_adr + 1

// HCUReadLineFIFO
//if end of HCU line and advance to next line
if (hrf_hcu_endofline == 1) AND (hrf_yadvance == 1) then
    hcu_startreadline_adr = hcu_startreadline_adr + hcu_dram_words
    offset = hcu_startreadline_adr - end_sfu_adr
    if (offset >= 0) then
        hcu_startreadline_adr = start_sfu_adr + offset
    hcu_readline_adr = hcu_startreadline_adr
elsif (hrf_hcu_endofline == 1) AND (hrf_yadvance == 0) then
    hcu_readline_adr = hcu_startreadline_adr
//if pointing to end of SFU address space
elsif (hcu_readline_adr == end_sfu_adr) then
    //go to start of SFU address space
    hcu_readline_adr = start_sfu_adr
else
    //increment address pointer
    hcu_readline_adr = hcu_readline_adr + 1
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26 Tag Encoder (TE) 



26.1 Overview 



Figure 141. High level block diagram of TE in context

^V^SJ^S^ ^ ' "i""* «''"'«l-'>«y printed with an infrared-absorptive ink that 

52 cycles within PEcT Uth^ <Zr.ovn cy<^les. This is actually accomplished in approximately 

no Jnal or^T^^c^ete i^^^^^^^^^^ if^tr^'^ pn>ductio„^^r cycle to I 

ywic snouia not lose the 63/52 cycle perfonnance edge attained in the PECl TE. 
26.2 What are tags?



The first barcode was described in the late 1940s by Woodland and Silver, and finally patented in 1952
(US Patent 2,612,994) when electronic parts were scarce and very expensive. Now, however, with the
advent of cheap and readily available computer technology, nearly every item purchased from a shop
contains a barcode of some description on the packaging. From books to CDs to grocery items, the barcode
provides a convenient way of identifying an object by a product number. The exact interpretation of the
product number depends on the type of barcode. Warehouse inventory tracking systems let users define
their own product number ranges, while inventory in shops must be more universally encoded so that
products from one company don't overlap with products from another company. Universal Product Codes
(UPC) were introduced in the mid 1970s at the request of the National Association of Food Chains for
this very reason.

Barcodes themselves have been specified in a large number of formats. The older barcode formats contain
characters that are displayed in the form of lines. The combination of black and white lines describes the
information the barcode contains. Often there are two types of lines to form the complete barcode: the
characters (the information itself) and lines to separate blocks for better optical recognition. While the
information may change from barcode to barcode, the lines to separate blocks stay constant. The lines to
separate blocks can therefore be thought of as part of the constant structural components of the barcode.
Barcodes are read with specialized reading devices that then pass the extracted data on to the computer for
further processing. For example, a point-of-sale scanning device allows the sales assistant to add the
scanned item to the current sale, places the name of the item and the price on a display device for
verification, etc. Light-pens, gun readers, scanners, slot readers, and cameras are among the many devices
used to read the barcodes.

To help ensure that the data extracted was read correctly, checksums were introduced as a crude form of
error detection. More recent barcode formats, such as the Aztec 2D barcode developed by Andy Longacre
in 1995 (US patent number US5591956), but now released to the public domain, use redundancy encoding
schemes such as Reed-Solomon. Reed-Solomon encoding is adequately discussed in [24], [26] and [30].
The reader is advised to refer to these sources for background information. Very often the degree of
redundancy encoding is user selectable.

More recently there has also been a move from the simple one dimensional barcodes (line based) to two
dimensional barcodes. Instead of storing the information as a series of lines, where the data can be
extracted from a single dimension, the information is encoded in two dimensions. Just as with the original
barcodes, the 2D barcode contains both information and structural components for better optical
recognition. Figure 142 shows an example of a QR Code (Quick Response Code), developed by Denso of Japan
(US patent number US5726435). Note the barcode cell is comprised of two areas: a data area (depends on
the data being stored in the barcode), and a constant position detection pattern. The constant position
detection pattern is used by the reader to help locate the cell itself, then to locate the cell boundaries, and to
allow the reader to determine the original orientation of the cell (orientation can be determined by the fact
that there is no 4th corner pattern).




Figure 142. Example QR Code developed by Denso of Japan 

The number of barcode encoding schemes grows daily. Yet very often the hardware for producing these
barcodes is specific to the particular barcode format. As printers become more and more embedded, there
is an increasing desire for real-time printing of these barcodes. In particular, Netpage enabled applications
require the printing of 2D barcodes (or tags) over the page, preferably in infra-red ink. The tag encoder in
SoPEC uses a generic barcode format encoding scheme which is particularly suited to real-time printing.
Since the barcode encoding format is generic, the same rendering hardware engine can be used to produce
a wide variety of barcode formats.

Unfortunately the term "barcode" is interpreted in different ways by different people. Sometimes it refers
only to the data area component, and does not include the constant position detection pattern. In other
cases it refers to both data and constant position detection pattern.

We therefore use the term tag to refer to the combination of data and any other components (such as
position detection pattern, surrounding blank space, etc.) that must be rendered to help hold or locate/read the data.
A tag therefore contains the following components:

• data area(s). The data area is the whole reason that the tag exists. The tag data area(s) contains the
encoded data (optionally redundancy-encoded, perhaps simply checksummed) where the bits of the
data are placed within the data area at locations specified by the tag encoding scheme.

• constant background patterns, which typically include a constant position detection pattern. These
help the tag reader to locate the tag. They include components that are easy to locate and may contain
orientation and perspective information in the case of 2D tags. Constant background patterns may also
include such patterns as a blank area surrounding the data area or position detection pattern. These
blank patterns can aid in the decoding of the data by ensuring that there is no interference between tags
or data areas.

In most tag encoding schemes there is at least some constant background pattern, but it is not necessarily 
required by all. For example, if the tag data area is enclosed by a physical space and the reading means 
uses a non-optical location mechanism (e.g. physical alignment of surface to data reader) then a position 
detection pattern is not required. 

Different tag encoding schemes have different sized tags, and have different allocation of physical tag area
to constant position detection pattern and data area. For example, the QR code has 3 fixed blocks at the
edges of the tag for position detection pattern (see Figure 142) and a data area in the remainder. By
contrast, the Netpage tag structure (see Figures 143 and 144) contains a circular locator component, an
orientation feature, and several data areas. Figure 143(a) shows the Netpage tag constant background pattern,
Figure 143(b) shows the Netpage tag with its data areas, and Figure 144 shows a Netpage tag with data
rendered at 1600 dpi. Note that a single bit of data is represented by many physical output dots to
form a block within the data area.




(a) Netpage tag background pattern

(b) Netpage tag showing data areas

Figure 143. Netpage tag structure




Figure 144. Netpage tag with data rendered at 1600 dpi (magnified view)

26.2.1 Contents of the data area

The data area contains the data for the tag
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ning resolution. For example, in the QR code (see Figure 142). a single bit is represented by a dark module 
or a light module, where the exact number of dots in the dark module or light module depends on the
rendering resolution and target reading/scanning resolution. For example, a dark module may be represented
by a square block of printed dots (all on for binary 1, or all off for binary 0), as shown in Figure 145.




Figure 145. Example of 2x2 dots for each block of QR code 

The point to note here is that a single bit of data may be represented in the printed tag by an arbitrary 
printed shape. The smallest shape is a single printed dot, while the largest shape is theoretically the whole 
tag itself, for example a giant macrodot comprised of many printed dots in both dimensions. 

An ideal generic tag definition structure allows the generation of an arbitrary printed shape from each bit 
of data. 
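As a minimal illustration of one data bit becoming a block of printed dots, the following Python sketch expands a small grid of data bits into square macrodots. The function name and the square, all-on/all-off block shape are assumptions chosen for simplicity; a generic tag definition would allow arbitrary shapes per bit.

```python
def render_macrodots(bits, dots_per_bit):
    """Expand a 2D grid of data bits into a grid of printed dots.

    Hypothetical helper: each bit becomes a dots_per_bit x dots_per_bit
    square that is all on for 1 and all off for 0.
    """
    dot_rows = []
    for bit_row in bits:
        # Stretch the row horizontally: repeat each bit dots_per_bit times.
        dot_row = []
        for bit in bit_row:
            dot_row.extend([bit] * dots_per_bit)
        # Stretch vertically: emit the stretched row dots_per_bit times.
        for _ in range(dots_per_bit):
            dot_rows.append(list(dot_row))
    return dot_rows


for row in render_macrodots([[1, 0], [0, 1]], 2):
    print("".join("#" if dot else "." for dot in row))
```

This corresponds to the 2x2 dots per block case; increasing dots_per_bit models a higher rendering resolution for the same tag data.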

26.2.2 What do the bits represent? 

Given an original number of bits of data, and the desire to place those bits into a printed tag for subsequent
retrieval via a reading/scanning mechanism, the original number of bits can either be placed directly into
the tag, or they can be redundancy-encoded in some way. The exact form of redundancy encoding will
depend on the tag format.

The placement of data bits within the data area of the tag is directly related to the redundancy mechanism
employed in the encoding scheme. The idea is generally to place data bits together in 2D so that burst
errors are averaged out over the tag data, thus typically being correctable. For example, all the bits of a
Reed-Solomon codeword would be spread out over the entire tag data area so as to minimize the effect
of a burst error.

Since the data encoding scheme and shape and size of the tag data area are closely linked, it is desirable to
have a generic tag format structure. This allows the same data structure and rendering embodiment to be
used to render a variety of tag formats.

26.2.2.1 Fixed and variable data components

In many cases, the tag data can be reasonably divided into fixed and variable components. For example, if 
a tag holds N bits of data, some of these bits may be fixed for all tags while some may vary from tag to tag. 
For example, the Universal product code allows a country code and a company code. Since these bits don't 
change from tag to tag, these bits can be defined as fixed, and don't need to be provided to the tag encoder 
each time, thereby reducing the bandwidth when producing many tags. 
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the overall bandwidth can be reduced ^ ^ ' ^8 ^^'^ ^^''^ *^ 

plctely variable. wh?leSi^ti%nc,S^ ? encoder may be con.- 

of tag data bits. ^ '^'"'''^ « s tag encoder may have a maximumnumber 

26.2.2.2 Redundancy-encode the tag data within the tag encoder

26.3 Placement of tags on a page 

The TE places tags on the page in a triangular grid arrangement as shown in Figure 146.



Figure 146. Placement of tags for portrait & landscape printing

The triangular mesh of tags combined with the restriction of no overlap of columns or rows of tags means

The general case for placement of tags therefore relies on a number of parameters, as shown in Figure 147.

Figure 147. General representation of tag placement
n'^l'^r^SS!'" ""^"""^ - -^^'^ Note that these are placement parameters and 

Table 120. Tag placement parameters

Tag height: The number of dot lines in a tag's bounding box. Minimum 1.

Tag width: The number of dots in a single line of the tag's bounding box. The number of dots in the tag
itself may vary depending on the shape of the tag, but the number of dots in the bounding box will be
constant (by definition). Minimum 1.

Dot inter-tag gap: The number of dots from the edge of one tag's bounding box to the start of the next
tag's bounding box in the dot direction. Minimum 0.

Line inter-tag gap: The number of dot lines from the edge of one tag's bounding box to the start of the
next tag's bounding box, in the line direction. Minimum 0.

Start Position: Defines the status of the top left dot on the page. It is an offset in dot and row within the
tag or the inter-tag gap.

AltTagLinePosition: Defines the status for the start of the alternate row of tags. It is an offset in dot within
the tag or within the dot inter-tag gap (the row position is always 0).





26.4 Basic tag encoding parameters 
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pie matter to adjust the buffer sizes and corresponding 
addressing to allow arbitrary encoding parameters in future implementations.

Table 121. Encoding parameters

Page width: 2^14 dotpairs or 20.48 inches at 1600 dpi.

S, tag size: typical tag size is 2 mm x 2 mm; maximum tag size is 384 dots x 384 dots before scaling,
i.e. 6 mm x 6 mm at 1600 dpi.

N, number of dots in each dimension of the tag: 384 dots before scaling.

E, redundancy encoding for tag data: Reed-Solomon GF(2^4) at 5:10 or 7:8.

Df, size of fixed data (unencoded): 40 or 56 bits.

Size of redundancy-encoded fixed data: 120 bits.

Dv, size of variable data (unencoded): 120 or 112 bits.

Size of redundancy-encoded variable data: 360 or 240 bits.

T, tags per page width: 85 packed 6 mm x 6 mm tags (384 x 384 dots) will fit in 20.48 inches.



hitc ftf ,.n-„.«^rj 7 . J V*; . """^ °^ suppuea to me 1 E once. It can be suppUed as 40 or 56 
III! ^ °° ° '^'""*'»'''^"™'^«»'l>«™ triable for e«4iM Viri 



26.4.1 Redundancy encoding 



The mapping of data bits to redundancy encoded bits relies heavily on the
method of redundancy encoding employed. Reed-Solomon encoding was chosen for its ability to deal with
burst errors and effectively detect and correct errors using a minimum of redundancy. Reed-Solomon
encoding is adequately discussed in [24], [26] and [30], and the reader is advised to refer to these
sources for background information.

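In this implementation of the TE we use Reed-Solomon encoding over the Galois Field GF(2^4). Symbol
size is 4 bits. Each codeword contains 15 4-bit symbols for a codeword length of 60 bits.
Of the 15 symbols, there are two possibilities for encoding:

• RS(15, 5): 5 symbols original data (20 bits), 10 redundancy symbols (40 bits). The 10 redundancy
symbols mean that we can correct up to 5 symbols in error.

• RS(15, 7): 7 symbols original data (28 bits), 8 redundancy symbols (32 bits). The 8 redundancy
symbols mean that we can correct up to 4 symbols in error.

In the first case, with 5 symbols of original data, the total amount of original data per tag is 160 bits (40
fixed, 120 variable). This is redundancy encoded to give a total amount of 480 bits (120 fixed, 360
variable) as follows: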
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• Each tag contains up to 40 bits of fixed original data. Therefore 2 codewords are required for the fixed 
data, giving a total encoded data size of 120 bits. Note that this fixed data only needs to be encoded 
once per page. 

• Each tag contains up to 120 bits of variable original data. Therefore 6 codewords are required for the 
variable data, giving a total encoded data size of 360 bits. 

In the second case, with 7 symbols of original data, the total amount of original data per tag is 168 bits (56 
fixed, 112 variable). This is redundancy encoded to give a total amount of 360 bits (120 fixed, 240 vari- 
able) as follows: 

• Each tag contains up to 56 bits of fixed original data. Therefore 2 codewords are required for the fixed 
data, giving a total encoded data size of 120 bits. Note that this fixed data only needs to be encoded 
once per page. 

• Each tag contains up to 112 bits of variable original data. Therefore 4 codewords are required for the
variable data, giving a total encoded data size of 240 bits.

The choice of data to redundancy ratio depends on the application. 
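The codeword counts quoted for both encoding options follow directly from the symbol and codeword sizes, and can be checked with a few lines of Python. The function and constant names are hypothetical; the sketch only reproduces the arithmetic (4-bit symbols, 15 symbols per codeword) and does not perform Reed-Solomon encoding.

```python
SYMBOL_BITS = 4          # GF(2^4) symbols are 4 bits wide
CODEWORD_SYMBOLS = 15    # each codeword holds 15 symbols (60 bits)


def encoded_size(original_bits, data_symbols):
    """Return (codewords, encoded_bits) for a block of original data.

    data_symbols is 5 or 7, the number of original-data symbols per
    codeword in the two encoding options.
    """
    data_bits_per_codeword = data_symbols * SYMBOL_BITS
    # Ceiling division: a partly filled codeword still occupies 60 bits.
    codewords = -(-original_bits // data_bits_per_codeword)
    return codewords, codewords * CODEWORD_SYMBOLS * SYMBOL_BITS


def correctable_symbols(data_symbols):
    """Symbol errors correctable per codeword: half the redundancy symbols."""
    return (CODEWORD_SYMBOLS - data_symbols) // 2


for k, fixed, variable in ((5, 40, 120), (7, 56, 112)):
    print(k, encoded_size(fixed, k), encoded_size(variable, k),
          correctable_symbols(k))
```

The trade-off is visible in the output: the 5-symbol option corrects more symbol errors per codeword, while the 7-symbol option carries slightly more original data (168 bits against 160) in fewer encoded bits (360 against 480).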

26.5 Data structures used by tag encoder

26.5.1 Tag Format Structure 

The Tag Format Structure (TFS) is the template used to render tags, optimized so that the tag can be ren- 
dered in real time. The TFS contains an entry for each dot position within the tag's bounding box. Each 
entry specifies whether the dot is part of the constant background pattern or part of the tag's data compo- 
nent (both fixed and variable). 

The TFS is very similar to a bitmap in that it contains one entry for each dot position of the tag's bounding 
box. The TFS therefore has TagHeight x TagWidth entries, where TagHeight matches the height of the 
bounding box for the tag in the line dimension, and TagWidth matches the width of the bounding box for 
the tag in the dot dimension. A single line of TFS entries for a tag is known as a tag line structure. 

The TFS consists of TagHeight number of tag line structures, one for each 1600 dpi line in the tag's
bounding box. Each tag line structure contains three contiguous tables, known as tables A, B, and C. Table
A contains 384 2-bit entries, one entry for each of the maximum number of dots in a single line of a tag
(see Table 121). The actual number of entries used should match the size of the bounding box for the tag in
the dot dimension, but all 384 entries must be present (this is done so that it is possible to go from one line
within a tag to the next by simply adding 33 in 32-bit based addressing to DRAM). Table B contains 32
9-bit data addresses that refer to (in order of appearance) the data dots present in the particular line. All 32
entries must be present, even if fewer are used. Table C contains two 5-bit pointers into table B, and is
stored in the 10 low bits of the next 32-bit word (the upper 22 bits are unused). The total length of each tag
line structure is therefore 34 x 32-bit words. Padding (18 x 32-bit words) is inserted after every 7 tag line
structures to keep each tag line structure completely within a 1 KByte boundary (thus a TFS requires
TagHeight/7 KBytes, including padding).
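The sizes given for tables A, B and C and for the padding can be checked, and the word offset of a given tag line structure computed, with the following Python sketch. The names are hypothetical, and the layout assumes each group of 7 tag line structures starts on a 1 KByte boundary with the 18 padding words at the end of the group.

```python
WORD_BITS = 32
TABLE_A_WORDS = (384 * 2) // WORD_BITS   # 384 2-bit entries -> 24 words
TABLE_B_WORDS = (32 * 9) // WORD_BITS    # 32 9-bit addresses -> 9 words
TABLE_C_WORDS = 1                        # two 5-bit pointers in one word
LINE_WORDS = TABLE_A_WORDS + TABLE_B_WORDS + TABLE_C_WORDS   # 34 words
LINES_PER_GROUP = 7
PADDING_WORDS = 18
GROUP_WORDS = LINES_PER_GROUP * LINE_WORDS + PADDING_WORDS   # 256 words


def tag_line_word_offset(line):
    """32-bit word offset of a tag line structure from the TFS base."""
    group, line_in_group = divmod(line, LINES_PER_GROUP)
    return group * GROUP_WORDS + line_in_group * LINE_WORDS


def tfs_kbytes(tag_height):
    """KBytes occupied by a TFS, counting each started group as 1 KByte."""
    return -(-tag_height // LINES_PER_GROUP)


print(LINE_WORDS, GROUP_WORDS * 4)   # 34 words per line, 1024 bytes per group
print(tag_line_word_offset(7))       # 256: the eighth line starts the next KByte
print(tfs_kbytes(384))               # 55
```

Because 7 x 34 + 18 = 256 words, no tag line structure ever straddles a 1 KByte boundary, which is the stated purpose of the padding.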



Figure 148. Composition of SoPEC's tag format structure

A description of the interpretation and usage of Tables A, B and C is given in section 26.8.3.



26.5.1.1 Scaling a tag



alternative, the o^iS^^^^S^T^L^S ""T""^ ""^''^^ - TFS. As aj 

the both the TE and TlJ ^*^***'^P''''''^'*'^°"^^»«^^^ 1 in 

s^^^^s^di^wn^^^ 

would repeat each entry across each Un^ of TTS In tt ^ "fl^ ^""^ 

net number of entriesSthe TpiZ^Tul^:;^ fT^^^^^^ T '^^ '"^^ 

e^i;^^trxfd:rtat;:^^ ^ ^ ^'-p'« 

of the original dots was r^^^e^d ^yTn^terdo^T ''""i'' '""^ ^""^ 

dimension of the original ?FS. either^ incLri A^s^^^ ^ ^ 

scale-up on the output of the tag genLor outjr^^jfh^rpr^j,^^^^^^ 
Instead, we can replace each of the original dots in the TFS by a 7 x 7 dot definition of a rounded dot.
Figure 150 shows the results.

Figure 149. Simple 3x3 tag structure

Figure 150. 3x3 tag redesigned for 21 x 21 area (not simple replication)

Consequently, the higher the resolution of the TFS the more printed dots can be printed for each macrodot,
where a macrodot represents a single data bit of the tag. The more dots that are available to produce a
macrodot, the more complex the pattern used to represent each macrodot can be. As an example, Figure 144
shows the Netpage tag structure rendered such that the data bits are represented by an average of 8 dots x
8 dots (at 1600 dpi), but the actual shape structure of a dot is not square. This allows the printed Netpage
tag to be subsequently read at any orientation.

26.5.2 Raw tag data

The TE requires a band of unencoded variable tag data if variable data is to be included in the tag
bit-plane. A band of unencoded variable tag data is a set of contiguous unencoded tag data records, in order
of encounter in the printed band from top left to lower right.

An unencoded tag data record is 128 bits arranged as follows: bits 0-111 or 0-119 are the bits of raw tag
data, bit 120 is a flag used by the TE (TagIsPrinted), and the remaining 7 bits are reserved (and should be
0). Having a record size of 128 bits simplifies the tag data access since the data of two tags fits into a
256-bit DRAM word. It also means that the flags can be stored apart from the tag data, thus keeping the raw
tag data completely unrestricted. If there is an odd number of tags in a line then the last DRAM read will
contain a tag in the first 128 bits and padding in the final 128 bits.
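The 128-bit record layout can be sketched in Python using integers as bit vectors. The function names are hypothetical, and two details are assumptions not stated in the text: bit 0 is taken to be the least significant bit of the record, and the first tag of a pair is taken to occupy the low 128 bits of the 256-bit DRAM word.

```python
RAW_DATA_BITS = 120      # bits 0-119 (only 0-111 used with 112-bit data)
TAG_IS_PRINTED_BIT = 120
RECORD_BITS = 128


def pack_record(raw_data, tag_is_printed):
    """Build one 128-bit unencoded tag data record as an integer."""
    if raw_data < 0 or raw_data >> RAW_DATA_BITS:
        raise ValueError("raw tag data must fit in 120 bits")
    # Reserved bits 121-127 are left at 0.
    return raw_data | (int(bool(tag_is_printed)) << TAG_IS_PRINTED_BIT)


def unpack_record(record):
    """Return (raw_data, tag_is_printed) from a 128-bit record."""
    raw_data = record & ((1 << RAW_DATA_BITS) - 1)
    tag_is_printed = (record >> TAG_IS_PRINTED_BIT) & 1
    return raw_data, tag_is_printed


def pack_dram_word(first_record, second_record=0):
    """Combine two records into one 256-bit word; 0 pads an odd final tag."""
    return first_record | (second_record << RECORD_BITS)


record = pack_record(0xABC, True)
print(unpack_record(record))                 # (2748, 1)
print(pack_dram_word(record).bit_length())   # 121: upper half is padding
```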

The TagIsPrinted flag allows the effective specification of a tag resolution mask over the page. For each
tag position the TagIsPrinted flag determines whether any of the tag is printed or not. This allows arbitrary
placement of tags on the page. For example, tags may only be printed over particular active areas of a
page. The TagIsPrinted flag allows only those tags to be printed. TagIsPrinted is a 1 bit flag with values as
shown in Table 122.

Table 122. TagIsPrinted values

0: Don't print the tag in this tag position. Output 0 for each dot within the tag bounding box.

1: Print the tag as specified by the various tag structures.



26.5.3 DRAM storage requirements

The total DRAM storage required by a single band of raw tag data depends on the number of tags present
in the band. Each tag requires 128 bits. Consequently, if there are N tags in the band, the size in DRAM is
16N bytes.

The maximum size of a line of tags is 163 x 128 bits. When maximally packed, a row of tags contains 163
tags (see Table 121) and extends over a minimum of 126 print lines. This equates to 282 KBytes over a
Letter page.

The total DRAM storage required by a single TFS is TagHeight/7 KBytes (including padding). Since the
likely maximum value for TagHeight is 384 (given that SoPEC restricts TagWidth to 384), the maximum
size in DRAM for a TFS is 55 KBytes.

26.5.4 DRAM access requirements

The TE has two separate read interfaces to DRAM for raw tag data, TD, and tag format structure, TFS.
The memory usage requirements are shown in Table 123. Raw tag data is stored in the compressed page store.



Table 123. Memory usage requirements

Compressed page store (2048 Kbytes): Compressed data page store for bi-level, contone and raw tag data.

Tag Format Structure (55 Kbyte for 384 dot line tags @ 1600 dpi): 55 kB in PEC1 for 384 dot line tags (the
benchmark) at 1600 dpi. 2.5 mm tags (1/10th inch) @ 1600 dpi require 160 dot lines = 160/384 x 55 or
23 kB. 2.5 mm tags @ 800 dpi require 80/384 x 55 = 12 kB.



The TD interface will read 256 bits from DRAM at a time. Each 256-bit read returns 2 times 128-bit tags.
The TD interface to the DIU will be a 256-bit double buffer. If there is an odd number of tags in a line then
the last DRAM read will contain a tag in the first 128 bits and padding in the final 128 bits.

The TFS interface will also read 256 bits from DRAM at a time. The TFS required for a line is 136 bytes.
A total of 5 times 256-bit DRAM reads is required to read the TFS for a line, with 192 unused bits in the
fifth 256-bit word. A 136-byte double-line buffer will be implemented to store the TFS data.
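The read counts quoted above are simple ceiling divisions, as the following Python sketch shows. The function name is hypothetical; the 126-cycle tag duration used for the TD rate is taken from the note to Table 124 (a 2 mm tag lasts 126 dot cycles and needs 128 bits).

```python
DRAM_WORD_BITS = 256


def reads_needed(n_bytes):
    """Return (number of 256-bit reads, unused bits in the last read)."""
    bits = n_bytes * 8
    reads = -(-bits // DRAM_WORD_BITS)   # ceiling division
    return reads, reads * DRAM_WORD_BITS - bits


# One tag line structure is 136 bytes.
print(reads_needed(136))          # (5, 192)

# Two 128-bit tag records arrive per 256-bit read, one tag per 126 cycles.
td_bits_per_cycle = DRAM_WORD_BITS / (2 * 126)
print(round(td_bits_per_cycle, 2))   # 1.02
```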

The TE's DIU bandwidth requirements are summarized in Table 124.



Table 124. DRAM bandwidth requirements

TD: Read. Single 256 bit reads (note 1). Bandwidth columns (bits/cycle): 1.02, 1.02.

TFS: Read. Single 256 bit reads (note 2). TFS is 136 bytes. This means there is unused data in the fifth
256 bit read. A total of 5 reads is required. Bandwidth columns (bits/cycle): 0.093, 0.093.

1: Each 2 mm tag lasts 126 dot cycles and requires 128 bits. This is a rate of 256 bits every 252 cycles.
2: 17 x 64 bit reads per line in PEC1 is 5 x 256 bit reads per line in SoPEC with unused bits in the last 256-bit read.



26.5.5 Tag sizes 

SoPEC allows for tags to be between 0 and 384 dots. A typical 2 mm tag requires 126 dots. Short tags do not
change the internal bandwidth or throughput behaviours at all. Tag height is specified so as to allow the
DRAM storage for raw tag data to be specified. Minimum tag width is a condition imposed by throughput
limitations, so if the width is too small the TE cannot consistently produce 2 dots per cycle across several tags
(there are also raw tag data bandwidth implications). Thinner tags still work; they just take longer and/or
need scaling.
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26.6 Implementation 



26.6.1 Tag Encoder Architecture 

A block diagram of the TE can be seen below. 




Figure 151. TE Block Diagram 

Tags peTline wiS I aslfts p^^^^^ ^"P''^'' ^ '^^^^ (^mm densely packed) with 1 08 
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1^1^! °f TFS interface that loads and decodes TFS entries, a tag data interface that 

ate addressing and control signals. The TE has two separate read interfaces to DRAM for raw tag data,
TD, and tag format structure, TFS.

It is possible that the raw tag data interface, the TD, to the DIU could be replaced by a hardware state
tfi^.ie_oiZltd^L^- ^""^ "^"^ ""'P"^ - -h'n 

26.6.2 Y-Scaling output lines 

To support Y-scaling of output lines, the following modifications to the PEC1 TE are suggested to
the Tag Data Interface, Tag Format Structure Interface and TE Top Level:

* M^i^^T'^T ""'^ configuration registers of Table 126.firstTagUneHeigh, and tag- 
JSf r . " ™"l''Pl'^ "P by the scale factor YScale. Within ^e Tag Data interfefe 
Aere are two counter., counu and county that have a direct bearing on the rav^Tagola^J^ ^. 
tioa counu decrements as tags aie read from DRAM. It is reset to NumTags[RtdLsense] J^of 

sSi'nl"^" h "^J" '^^--^ed « each line of tags is completely rel/fiom iJ.Zn J 

- Y-scaling may be performed by counting the number of times countx reaches zero and only decrementing
county when this number reaches YScale. This will cause the Tag Data Interface to read each
line of tag data NumTags[RtdTagSense] * YScale times.

• for Tag Format Structure Interface: The implication of Y-scaling for the TFS is that each Tag Line
Structure is used YScale times. This may be accomplished in either of two ways:

' ^^^^liltt^iV^^^T^ '^ ^" ^^"^'^ ^-^^"^^ ^« ™s involves gating 

^ con^ol of TFS bufTw flippmg with YScale. Because of the way in which this advT/sLine anl 
advTagLine related fimctronal.ty ,s coded in the FECI TFS this solution is judged to be error-prone 

* Slwr^u'jS^^'™''*^ ™' «»'^^o"in8 ^^e activity of currTf- 

JlLf^H^mn"^^ '""f,f!Pi?.^ "^"^^ *° "^'l ^''^^ individual Tag Line Struc- 

.W il^Pr * '^"^ "^^^ ^ ^"^^^ "T*^ fro" the behav- 

iour m PEC 1 where one address is given and 17 data-words were returned by the DIU 

Since the behaviour of thccurrl/sAddr must be changed to meet the requirements of the SoPEC 
nwJ^r^?r°" '"'r't.^^ ^-^'^^ ^bange i.e. a coJJt of the number of com- 

^^rt^^flTIT ^^I'^iy '^'^'^ O-'y ^ben this count equals YScale Z 

currTfsAdr be loaded with the base address of the next line's Tag Line Structure in DRAM; otherwise
it is re-loaded with the base address of the current line's Tag Line Structure in DRAM.

' rr^Iwii^"!l?r '^"^ ^''^^ "^^ ^ * ^bich is used to count the number of 

a when m a tag gap or in a line of tags. At the start (i.e. top-left hand dot-pair) of 

Z^ton Y^ltT^^ " "'•^"'"'^ '^^ accomplished by gating the decrement of L/neA,. 
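
The Y-scaling counter behaviour described above for the Tag Data Interface can be sketched as a small model. This is an illustrative sketch only: the function name and the tag_lines parameter are hypothetical, the counters are plain integers loaded from counts rather than from the real register encodings, and the real interface issues DRAM reads rather than building a list.

```python
def tag_row_reads(num_tags, tag_lines, y_scale):
    """Return the (line, repeat, tag) reads issued for one row of tags.

    countx steps through the tags of a line; county only steps once
    countx has reached zero y_scale times, so every line of tag data
    is read y_scale times over.
    """
    reads = []
    county = tag_lines - 1
    while True:
        for repeat in range(y_scale):          # count of countx wrap-arounds
            countx = num_tags - 1
            while True:
                line = tag_lines - 1 - county
                tag = num_tags - 1 - countx
                reads.append((line, repeat, tag))
                if countx == 0:
                    break
                countx -= 1
        if county == 0:
            break
        county -= 1                            # gated by the repeat count
    return reads
```

With y_scale = 1 the model reduces to reading each line of tag data once, which corresponds to the unscaled behaviour.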
26.6.3 TE Physical Hierarchy

Figure 152. TE Hierarchy
(Blocks shown: Tag Encoder top level, containing the top-level FSM, PCU interface and combinational logic for muxing; Raw Tag Data Interface, containing the Reed-Solomon encoder, 2D decoder and the encoded fixed and variable tag data stores; Tag Format Structure interface, containing Tables A, B and C with registered outputs.)

multiple SoPECs prinUag a single lii) SoPEC may be only pnntmg part of a tag due to 

wiJuVa^^rlhriltriri^:?.'" •"'"•^^ l^"- ^ ^^^^P*" position is 

Of dots, moving through tags and inter-tag gaps ^^g i^^Sc^p^Lm'^r '"^ 
26.6.4 IO Definitions

Table 125. TE Port List

Port name  Pins  I/O
    Description

Clocks and Resets

pclk  1  In
    SoPEC functional clock.
prst_n  1  In
    Global reset signal.

Bandstore Signals

cdu_endofbandstore[21:5]  17  In
    Address of the end of the current band of data.
    256-bit word aligned DRAM address.
cdu_startofbandstore[21:5]  17  In
    Address of the start of the current band of data.
    256-bit word aligned DRAM address.
te_finishedband  1  Out
    TE finished band signal to PCU and ICU.

PCU Interface data and control signals

pcu_addr[8:2]  7  In
    PCU address bus. 7 bits are required to decode the address space
    for this block.
pcu_dataout[31:0]  32  In
    Shared write data bus from the PCU.
te_pcu_datain[31:0]  32  Out
    Read data bus from the TE to the PCU.
pcu_rwn  1  In
    Common read/not-write signal from the PCU.
pcu_te_sel  1  In
    Block select from the PCU. When pcu_te_sel is high both
    pcu_addr and pcu_dataout are valid.
te_pcu_rdy  1  Out
    Ready signal to the PCU. When te_pcu_rdy is high it indicates the
    last cycle of the access. For a write cycle this means pcu_dataout
    has been registered by the block and for a read cycle this means
    the data on te_pcu_datain is valid.

TD (raw Tag Data) DIU Read Interface signals

td_diu_rreq  1  Out
    TD requests DRAM read. A read request must be accompanied by
    a valid read address.
td_diu_radr[21:5]  17  Out
    TD read address to DIU.
    17 bits wide (256-bit aligned word).
diu_td_rack  1  In
    Acknowledge from DIU that TD read request has been accepted
    and new read address can be placed on td_diu_radr.
diu_data[63:0]  64  In
    Data from DIU to TE.
    First 64 bits are bits 63:0 of 256-bit word;
    Second 64 bits are bits 127:64 of 256-bit word;
    Third 64 bits are bits 191:128 of 256-bit word;
    Fourth 64 bits are bits 255:192 of 256-bit word.
diu_td_rvalid  1  In
    Signal from DIU telling TD that valid read data is on the diu_data
    bus.

TFS (Tag Format Structure) DIU Read Interface signals

tfs_diu_rreq  1  Out
    TFS requests DRAM read. A read request must be accompanied
    by a valid read address.
tfs_diu_radr[21:5]  17  Out
    TFS read address to DIU.
    17 bits wide (256-bit aligned word).
diu_tfs_rack  1  In
    Acknowledge from DIU that TFS read request has been accepted
    and new read address can be placed on tfs_diu_radr.


Doc: SoPEC_hardware_design 
Version: 2.3 



S3 Proprietary Document 



29 Nov 2002 
Page 373 



SoPEC : Hardware Design 



Table 125. TE Port List

Port name  Pins  I/O
    Description

diu_data[63:0]  64  In
    Data from DIU to TE.
    First 64 bits are bits 63:0 of 256-bit word;
    Second 64 bits are bits 127:64 of 256-bit word;
    Third 64 bits are bits 191:128 of 256-bit word;
    Fourth 64 bits are bits 255:192 of 256-bit word.
diu_tfs_rvalid  1  In
    Signal from DIU telling TFS that valid read data is on the diu_data
    bus.

TFU Interface data and control signals

tfu_te_oktowrite  1  In
    Ready signal indicating TFU has space available and is ready to be
    written to. Also asserted from the point that the TFU has received
    its expected number of bytes for a line until the next
    te_tfu_wradvline.
te_tfu_wdata[7:0]  8  Out
    Write data for TFU.
te_tfu_wdatavalid  1  Out
    Write data valid signal. This signal remains high whenever there is
    valid output data on te_tfu_wdata.
te_tfu_wradvline  1  Out
    Advance line signal strobed when the last byte in a line is placed
    on te_tfu_wdata.



26.6.5 Configuration Registers 



The configuration registers in the TE are programmed via the PCU interface. Refer to section 21.8.2 for a
description of the protocol and timing diagrams for reading and writing registers in the TE.

S^^efd^eTowe^'fr^fThfpc^ a'SL'^^^*^"^ ^Zlpp^r^ ^iX^^Z^^ 

TETablel2t.itTe'rfi^:r^^^^^^ ^ ^ *e address space for the 

'"^''^ "^"^^^ ^bit DRAM word aligned as this is the case fop th« PPri tb 

these the 64.bit word aligned addresses on a 256.bit DRAM word boundaiy.. program 



Table 126. TE Configuration Registers

Address  Register
    Description

Control registers

0x00  Reset
    A write to this register causes a reset of the TE.
    This register can be read to indicate the reset state:
    0 - reset in progress
    1 - reset not in progress
0x04  Go
    Writing 1 to this register starts the TE. Writing 0 to this
    register halts the TE.
    When Go is deasserted the state-machines go to their
    idle states but all counters and configuration registers
    keep their values.
    When Go is asserted all counters are reset, but configuration
    registers keep their values (i.e. they don't get reset).
    NextBandEnable is cleared when Go is asserted.
    The TFU must be started before the TE is started.
    This register can be read to determine if the TE is running
    (1 = running, 0 = stopped).
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Table 126. TE ConfiguraUon Registers 




Setup registers (constant for processing of a page)



0x40  TfsStartAdr
(64-bit aligned DRAM address - should start at a 256-bit aligned location)



0x40 



0x50 



0x54 



0x58 



OxSC 



0x68 



OxSC 
0x70 



19 



0x44  TfsEndAdr  19
(64-bit aligned DRAM address - should start at a 256-bit aligned location)



0x48  TfsFirstLineAdr
(64-bit aligned address)



19 



DataRedun

Decode2DEn

VariableDataPresent

EncodeFixed

TagMaxDotpairs

TagMaxLine

TagGapDot

TagGapLine

DotPairsPerLine

DotStartTagSense



14 



Points to the first word of the first TFS line in DRAM.

Points to the first word of the last TFS line in DRAM.

Points to the first word of the first TFS line to be
encountered on the page. If the start of the page is in
an inter-tag gap, then this value will be the same as
TfsStartAdr since the first tag line reached will be the
top line of a tag.

Defines the data to redundancy ratio for the Reed
Solomon encoder. Symbol size is always 4 bits. Codeword
size is always 15 symbols (60 bits).
5 data symbols (20 bits), 10 redundancy symbols (40 bits)
7 data symbols (28 bits), 8 redundancy symbols (32 bits)



14 



Determines whether or not the data bits are to be 2D
decoded rather than redundancy encoded (each 2
bits of the data bits becomes 4 output data bits).
0 = redundancy encode data
1 = decode each 2 bits of data into 4 bits

Defines whether or not there is variable data in the
tags. If there is none, no attempt is made to read tag
data, and tag encoding should only reference fixed
tag data.

Determines whether or not the lower 40 (or 56) bits of
fixed data should be encoded into 120 bits or simply
used as is.

The width of a tag in dot-pairs, minus 1.
Minimum 0. Maximum = 191.

The number of lines in a tag, minus 1.
Minimum 0. Maximum = 383.

The number of dot pairs between tags in the dot
dimension minus 1.
Only valid if TagGapPresent[bit 0] = 1.

Defines the number of dotlines between tags in the
line dimension minus 1.
Only valid if TagGapPresent[bit 1] = 1.

Number of output dot pairs to generate per tag line.

Determines for the first/even (bit 0) and second/odd
(bit 1) rows of tags whether or not the first dot position
of the line is in a tag.
1 = in a tag, 0 = in an inter-tag gap.
Table 126. TE Configuration Registers

Address  Register  Bits  Reset
    Description

0x74  TagGapPresent
    Bit 0 is 1 if there is an inter-tag gap in the dot dimension,
    and 0 if tags are tightly packed.
    Bit 1 is 1 if there is an inter-tag gap in the line dimension,
    and 0 if tags are tightly packed.
-  YScale
    Tag scale factor in Y direction. Output lines to the TFU
    will be generated YScale times.
0x80 to 0x84  DotStartPos  2x14
    Determines for the first/even (0) and second/odd (1)
    rows of tags the number of dotpairs remaining minus
    1, in either the tag or inter-tag gap at the start of the
    line.
0x88 to 0x8C  NumTags  2x8
    Determines for the first/even and second/odd rows of
    tags how many tags are present in a line (equals
    number of tags minus 1).

Setup band related registers

0xC0  NextBandStartTagDataAdr
    64-bit aligned DRAM address; should start at a 256-bit aligned location.
    Holds the value of StartTagDataAdr for the next band.
    This value is copied to StartTagDataAdr when
    DoneBand is 1 and NextBandEnable is 1, or when Go
    transitions from 0 to 1.
0xC4  NextBandEndOfTagData
    64-bit aligned DRAM address.
    Holds the value of EndOfTagData for the next band.
    This value is copied to EndOfTagData when
    DoneBand is 1 and NextBandEnable is 1, or when Go
    transitions from 0 to 1.
0xC8  NextBandFirstTagLineHeight  9  0
    Holds the value of FirstTagLineHeight for the next
    band. This value is copied to FirstTagLineHeight when
    DoneBand is 1 and NextBandEnable is 1, or
    when Go transitions from 0 to 1.
0xCC  NextBandEnable
    When NextBandEnable is 1 and DoneBand is 1, then
    when te_finishedband is set at the end of a band:
    - NextBandStartTagDataAdr is copied to StartTagDataAdr
    - NextBandEndOfTagData is copied to EndOfTagData
    - NextBandFirstTagLineHeight is copied to FirstTagLineHeight
    - DoneBand is cleared
    - NextBandEnable is cleared.
    NextBandEnable is cleared when Go is asserted.

Read-only band related registers
Table 126. TE Configuration Registers

Address  Register  Bits  Reset
    Description

-  DoneBand  1  0
    Specifies whether the tag data interface has finished
    loading all the tag data for the band.
    It is cleared to 0 when Go transitions from 0 to 1.
    When the tag data interface has finished loading all
    the tag data for the band, the te_finishedband signal
    is given out and the DoneBand flag is set.
    If NextBandEnable is 1 at this time then StartTagDataAdr,
    EndOfTagData and FirstTagLineHeight are
    updated with the values for the next band and
    DoneBand is cleared. Processing of the next band
    starts immediately.
    If NextBandEnable is 0 then the remainder of the TE
    will continue to run while the read control unit waits
    for NextBandEnable to be set before it restarts. Read
    only.
0xD4  StartTagDataAdr  19  0
    64-bit aligned DRAM address; should start at a 256-bit aligned location.
    The start address of the current row of raw tag data.
    This initially points to the first word of the band's tag
    data, which should be aligned to a 128-bit boundary
    (i.e. the lower bit of this address should be 0). Read
    only.
0xD8  EndOfTagData  19  0
    64-bit aligned DRAM address.
    Points to the address of the final tag for the band.
    When all the tag data up to and including address
    EndOfTagData has been read in, the te_finishedband
    signal is given and the DoneBand flag is set. Read
    only.
0xDC  FirstTagLineHeight  9  0
    The number of lines minus 1 in the first tag encountered
    in this band. This will be equal to TagMaxLine if
    the band starts at a tag boundary. Read only.

Work registers (set before starting the TE and must not be touched between bands)

0x100  LineInTag  1  0
    Determines whether or not the first line of the page is
    in a line of tags or in an inter-tag gap.
    1 - in a tag, 0 - in an inter-tag gap.
0x104  LinePos  14  0
    The number of lines remaining minus 1, in either the
    tag or the inter-tag gap at the start of the page.
0x110 to 0x11C  TagData  4x32  0
    This 128 bit register must be set up initially with the
    fixed data record for the page. This is either the lower
    40 (or 56) bits (and the EncodeFixed register should
    be set), or the lower 120 bits (and EncodeFixed
    should be clear). The TagData[0] register contains the
    lower 32 bits and the TagData[3] register contains the
    upper 32 bits.
    This register is used throughout the tag encoding
    process to hold the next tag's variable data.

Work registers (set internally)
Read-only from the point of view of PCU register access

0x140  DotPos  14  0
    Defines the number of dotpairs remaining in either the
    tag or inter-tag gap. Does not need to be set up.
0x144  CurrTagPlaneAdr  14  0
    The dot-pair number being generated.
0x148  DotsInTag  1  0
    Determines whether the current dot pair is in a tag or
    not.
    1 - in a tag, 0 - in an inter-tag gap.
Table 126. TE Configuration Registers

Address  Register  Bits  Reset
    Description

0x14C  TagAltSense  1  0
    Determines whether the production of output dots is
    for the first (and subsequent even) or second (and
    subsequent odd) row of tags.
0x154  CurrTFSAdr  19  0
    64-bit aligned DRAM address.
    Points to the start of the next line of the TFS to be read in.
0x158  ReadsRemaining  4  0
    Number of reads remaining in the current burst from
    the raw tag data interface.
0x15C  CountX  8  0
    The number of tags remaining to be read (minus 1) by
    the raw tag data interface for the current line.
0x160  CountY  9  0
    The number of times (minus 1) the tag data for the
    current line of tags needs to be read in by the raw tag
    data interface.
0x164  RtdTagSense  1  0
    Determines whether the raw tag data interface is currently
    reading even rows of tags (=0) or odd rows of
    tags (=1) with respect to the start of the page. Note
    that this can be different from TagAltSense since the
    raw tag data interface is reading ahead of the production
    of dots.
0x168  RawTagDataAdr  19  0
    64-bit aligned DRAM address.
    The current read address within the unencoded raw
    tag data.



by including v^te deciders in thrsubTbl^c^^^^^ L the tn T i -b-blocks. Thi. is achieved 

reads the sul^block registers sr. fed tVthTSpl^^^^^^^^ '"^^ 1-^-™ 

accessible TE registers. ^ ^^^^"^^ " on all the PCU 



Figure 153. Block diagram of PCU accesses
(Labels shown: control, pcu_dataout[31:0], read decode, sub-block, top level, te_pcu_datain[31:0], te_pcu_rdy.)
26.6.5.1 Starting the TE and restarting the TE between bands

The TE must be started after the TFU. 



For the first band of data, users set up NextBandStartTagDataAdr, NextBandEndTagData and NextBandFirstTagLineHeight
as well as other TE configuration registers. Users then set the TE's Go bit to start processing
of the band. When the tag data for the band has finished being decoded, the te_finishedband
interrupt will be sent to the PCU and ICU indicating that the memory associated with the first band is now
free. Processing can now start on the next band of tag data.

In order to process the next band, NextBandStartTagDataAdr, NextBandEndTagData and NextBandFirstTagLineHeight
need to be updated before writing a 1 to NextBandEnable. There are 4 mechanisms for
restarting the TE between bands:

a. te_finishedband causes an interrupt to the CPU. The TE will have set its DoneBand bit. The
CPU reprograms the NextBandStartTagDataAdr, NextBandEndTagData and NextBandFirstTagLineHeight
registers, and sets NextBandEnable to restart the TE.

b. The CPU programs the TE's NextBandStartTagDataAdr, NextBandEndTagData and NextBandFirstTagLineHeight
registers and sets the NextBandEnable flag before the end of the current
band. At the end of the current band the TE sets DoneBand. As NextBandEnable is already 1,
the TE starts processing the next band immediately.

c. The PCU is programmed so that te_finishedband triggers the PCU to execute commands from
DRAM to reprogram the NextBandStartTagDataAdr, NextBandEndTagData and NextBandFirstTagLineHeight
registers and set the NextBandEnable bit to start the TE processing
the next band. The advantage of this scheme is that the CPU could process band headers in
advance and store the band commands in DRAM ready for execution.

d. This is a combination of b and c above. The PCU (rather than the CPU in b) programs the TE's
NextBandStartTagDataAdr, NextBandEndTagData and NextBandFirstTagLineHeight registers
and sets the NextBandEnable bit before the end of the current band. At the end of the current
band the TE sets DoneBand and pulses te_finishedband. As NextBandEnable is already 1, the
TE starts processing the next band immediately. Simultaneously, te_finishedband triggers the
PCU to fetch commands from DRAM. The TE will have restarted by the time the PCU has
fetched commands from DRAM. The PCU commands program the TE next band shadow registers
and set the NextBandEnable bit.

After the first tag on the page, all bands have their first tag start at the top, i.e. NextBandFirstTagLineHeight
= TagMaxLine. Therefore the same value of NextBandFirstTagLineHeight will normally be used for all
bands. Certainly, NextBandFirstTagLineHeight should not need to change after the second time it is programmed.
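
The interaction between Go, DoneBand, NextBandEnable and the next-band shadow registers can be sketched as a small software model. This is a minimal sketch under stated assumptions, not the hardware implementation: the class, method and attribute names are hypothetical, the register values are plain integers, and only the copy and clear rules stated in this section and in Table 126 are modelled.

```python
from dataclasses import dataclass, replace


@dataclass
class BandRegs:
    start_tag_data_adr: int = 0
    end_of_tag_data: int = 0
    first_tag_line_height: int = 0


class TeBandModel:
    """Hypothetical model of the TE band restart registers."""

    def __init__(self):
        self.next_band = BandRegs()   # NextBand* shadow registers
        self.current = BandRegs()     # working (read-only) registers
        self.go = 0
        self.done_band = 0
        self.next_band_enable = 0

    def _load_next_band(self):
        # Copy the shadow registers, then clear both flags.
        self.current = replace(self.next_band)
        self.done_band = 0
        self.next_band_enable = 0

    def write_go(self, value):
        if value == 1 and self.go == 0:   # Go transitions from 0 to 1
            self._load_next_band()
        self.go = value

    def write_next_band_enable(self):
        self.next_band_enable = 1
        if self.done_band:                # band already finished: restart now
            self._load_next_band()

    def band_finished(self):
        """Model the end of a band; the return value stands for te_finishedband."""
        self.done_band = 1
        if self.next_band_enable:         # mechanisms b and d: immediate restart
            self._load_next_band()
        return True
```

In mechanisms a and c the software (CPU or PCU) would call write_next_band_enable after band_finished; in mechanisms b and d it would call it beforehand, so the restart happens inside band_finished.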
26.6.6 TE Top Level FSM

The following diagram illustrates the states in the FSM. 

Figure 154. Tag Encoder Top-Level FSM
(Labels shown: Reset, Go = 1, TagDotLine.)

At the highest level, the TE state machine steps through the output lines of a page one line at a time, with
the starting position either in an inter-tag gap (signal dotsintag = 0) or in a tag (signals tfsvalid and tdvalid
and lineintag = 1) (a SoPEC may be only printing part of a tag due to multiple SoPECs printing a single
line).

If the current position is within an inter-tag gap, an output of 0 is generated. If the current position is
within a tag, the tag format structure is used to determine the value of the output dot, using the appropriate
encoded data bit from the fixed or variable data buffers as necessary. The TE then advances along the line
of dots, moving through tags and inter-tag gaps according to the tag placement parameters.
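
The walk along one line of dots can be sketched as follows. This is a simplified illustration, not the FSM itself: the function name is hypothetical, the parameters mirror the minus-1 encodings of TagMaxDotpairs, TagGapDot and DotStartPos, and an inter-tag gap is assumed to be present in the dot dimension.

```python
def dots_in_tag_for_line(dot_pairs_per_line, tag_max_dotpairs, tag_gap_dot,
                         start_in_tag, dot_start_pos):
    """Return the dotsintag flag (1 = tag, 0 = gap) for each dot-pair of a line.

    dot_start_pos is the number of dot-pairs remaining (minus 1) in the
    tag or gap the line starts in; the counter is reloaded with the tag
    width or the gap width each time it reaches zero.
    """
    flags = []
    dotsintag = start_in_tag
    dotpos = dot_start_pos
    for _ in range(dot_pairs_per_line):
        flags.append(dotsintag)
        if dotpos == 0:
            dotsintag = 1 - dotsintag            # cross a tag/gap boundary
            dotpos = tag_max_dotpairs if dotsintag else tag_gap_dot
        else:
            dotpos -= 1
    return flags
```

For dot-pairs flagged 0 the TE would output 0; for dot-pairs flagged 1 the output would come from the tag format structure and the encoded tag data.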



Table 127 highlights the signals used within the FSM.

Table 127. Signals used within TE top level FSM

Signal name            Function
pclk                   Sync clock used to register all data within the FSM
prst_n, te_reset       Reset signals
advtagline             1 cycle pulse indicating to TDI and TFS sub-blocks to move onto the next line of tag data
currdotlineadr[13:0]   Address counter starting 2 pclk cycles ahead of currtagplaneadr to generate the correct dotpair for the current line
dotpos                 Counter to identify how many dotpairs wide the tag/gap is
dotsintag              Signal identifying whether the dotpair are in a tag (1) or gap (0)
lineintag_temp         Identical to lineintag but generated 1 pclk earlier
linepos_shadow         Shadow register for linepos due to linepos being written to by 2 different processes
tagaltsense            Flag which alternates between tag/gap lines
te_state               FSM state variable
teplanebuf             6-bit shift register used to format dotpairs into a byte for the TFU
wradvline              Advance line signal strobed when the last byte in a line is placed on te_tfu_wdata
Due to the 2 system clock delay in the TFS (both Table A and Table B outputs are registered), the TE FSM
is working 2 system clock cycles AHEAD of the logic generating the write data for the TFU. As a result,
the following control signals had to be single/double registered on the system clock.



Figure 155. Generated Control Signals
(Signals shown: dotsintag, tdvalid, tfsvalid, tfu_ok_write and lineintag_temp are registered on pclk to give dotsintag1, tdvalid1, tfsvalid1, tfu_ok_write1 and lineintag1; the first four are registered again to give dotsintag2, tdvalid2, tfsvalid2 and tfu_ok_write2.)

The tag_dotJine state can be broken down into 3 different stages. 

S oJSotS/fff'VS'r ? T"^^ ^^'^^"^e active. TTus state controls the 

wntog of dotbytes to the TFU. As long as the tag line buffer address is not equal to the dotpatrsnerline 
Cf and (/i'-'e.o.fao.vnvc is active, and there is valid TFS and TD arable or CSTtoS^ 

phed to the TFU since the TFU is a FIFO rather than the line store used in PECl . ^ °" 

While generating the dotline of a tag/gap line (lineintag flag = 1) the dot position counter dotpos is
decremented/reloaded (with tagmaxdotpairs or taggapdot) as the TE moves between tags/gaps.

Stage2:. At this point the end of a dot line is reached so it is time to decrement the linepos counter if still in 

SearT^f oXZ T ^'^ -^^^ '^s r^iste^ islpdfteJ a 

rows when 17^7 IZ ? "^"^^ ^"^J*" between dot lines and tag 

rows when the dotpos and linepos counters reach zero, i.e. when dotpos = 0 the end of a tag/gap has been
reached, and when linepos = 0 the end of a tag row is reached. This stage uses the lineintag_temp and
tagaltsense signals, which were generated one system clock cycle earlier in Stage 1.



Stage 3: This stage implements the writing of dotpairs to the correct part of the 6-bit shift register based on
the LSBs of currtagplaneadr, and also implements the counter for currtagplaneadr, which counts up to
(dotpairsperline - 1). All the qualifier signals for this stage (e.g. dotsintag) are delayed by 2 system clock
cycles, i.e. the currtagplaneadr (which is the internal write address, not needed by the TFU) cannot be
incremented until the dotpairs are available, which is always two system clock cycles later than when
currdotlineadr is incremented.
The wradvline and advtagline pulses are generated using the same logic (currently separated in the PEC1
Tag Encoder VHDL for clarity). Both of these pulses are used to update further registers, hence the reason
they do not use the qualifiers delayed by 2 system clock cycles.

26.6.7 Combinational Logic 

The TDI is responsible for providing the information data for a tag while the TFSI is responsible for deciding
whether a particular dot on the tag should be printed as background pattern or tag information. Every
dot within a tag's boundary is either an information dot or part of the background pattern.



Figure 156. Logic to combine dot information and Encoded Data
(Signals shown: tdi_etd0, tdi_etd1 and tdi_tagisprinted from the TDI; tfsi_ta_dot0[1:0] and tfsi_ta_dot1[1:0] from the TFS interface; dotsintag; outputs dots[0] and dots[1].)

The resulting lines of dots are stored in the TFU. 

The TFSI reads one Tag Line Structure (TLS) from the DIU for every dot line of tags. Depending on the
current printing position within the tag (indicated by the signal tagdotnum), the TFS interface outputs dot
information for two dots and, if necessary, the corresponding read addresses for encoded tag data. The read
addresses are supplied to the TDI, which outputs the corresponding data values.

These data values (tdi_etd0 and tdi_etd1) are then combined with the dot information (tfsi_ta_dot0 and
tfsi_ta_dot1) to produce the dot values that will actually be printed on the page (dots), see Figure 156.



Figure 157. Generation of Lastdotintag/1
(Labels shown: dotsintag, tfsvalid, tdvalid, dotpos, dotpairsperline, lastdotintag1.)

The signal lastdotintag is generated by checking that the dots are in a tag {dotsintag = 1) and that the dot- 
position counter dotpos is equal to zero. It is also used by the TFS to load the index address register with 
zeros at the end of a tag as this is always the starting index when going from one tag to the next, lastdotin- 






'^^Lf'^^t'^'^''^""V'' "^^^ ^'^^'^ ^^^'^ adv_tfsjine pulse is used to update the Table C 
address reg for the new tag line - this is because lastdotintag occurs a cycle earli^Aan ^ST^ i u- u 
would result in the wn,ng Table C value for the last do^^rl7^:X^^^V-Zr^U 
(etd_sw.tch state) to pulse the «./_«^a^ signal hence s Etching buffers if tlfe CTD^r the n:xy^ 

TTie signal lastdotintagl is identical Xo.lastdotintag except it is combinatorially generated (\ cvcle «.H,>r 
i^U^dcnntag, except at the end of a tagline). lastdoLagl signal is oXl^^^^il^'^^^^Z 
tdvahd signal on the cycle when dotpos = 0. Note the UNSIGNiD(c«m/or/3H = Lns^?S2 

/arrrfori>ito^_^en process as this is an combinatorial process. y as m tne 



Figure 158. Generation of Dot Position Valid
(Labels shown: dotsintag1, tfsvalid1, tdvalid1, lineintag1, tfu_te_oktowrite1, dotposvalid.)

Sltrri^'n '^^"'"^ ^"""^ ^^^"6 in a tag line (lineintagl = 1). dots being in a tae 

Se to Toad'^e tS n'tT ^ ""^'-"^^ " ""^ i /Jsig^ii u^dT^ 



Figure 159. Generation of write enable to the TFU
(Labels shown: dotsintag, tfsvalid2, tdvalid2, currtagplaneadr, write enable and write address outputs to the TFU.)
The signal te_tfu_wdatavalid can only be active if in a tag gap or if valid tag data is available (tdvalid2 and
tfsvalid2) and currtagplaneadr[1:0] equals 11, i.e. a byte of data has been generated by combining four
dotpairs.
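
Combining four dot-pairs into one output byte can be sketched as below. This is a sketch under stated assumptions: the function name is hypothetical, a plain accumulator stands in for the shift register, and the bit ordering (dot-pair 0 in bits 1:0) is an assumption, since the text does not state it.

```python
def pack_dotpairs(dotpairs):
    """Pack 2-bit dot-pairs into bytes, four dot-pairs per byte.

    A byte is emitted when the low two address bits equal 0b11, which
    mirrors the condition given for te_tfu_wdatavalid. A trailing
    partial byte is not emitted in this sketch.
    """
    out = []
    buf = 0
    for adr, pair in enumerate(dotpairs):
        slot = adr & 0b11                      # LSBs of the write address
        buf |= (pair & 0b11) << (2 * slot)     # assumed ordering: slot 0 is lowest
        if slot == 0b11:
            out.append(buf)
            buf = 0
    return out
```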



Figure 160. Generation of Tag Dot Number
(Labels shown: tagmaxdotpairs, dotpos, tagdotnum.)

The signal tagdotnum tells the TFS how many dotpairs remain in a tag/gap. It is calculated by subtracting
the value in the dotpos counter from the value programmed in the tagmaxdotpairs register.
26.7 Tag Data Interface (TDI)

26.7.1 I/O Specification

Table 128. TDI Port List

Signal name  I/O
    Description

Clocks and Resets

pclk  In
    SoPEC system clock.
prst_n  In
    Active-low, synchronous reset in pclk domain.

DIU Read Interface signals

diu_data[63:0]  In
    Data from DRAM.
td_diu_rreq  Out
    Data request to DRAM.
td_diu_radr[21:5]  Out
    Read address to DRAM.
diu_td_rack  In
    Data acknowledge from DRAM.
diu_td_rvalid  In
    Data valid signal from DRAM.

PCU Interface Data, Control Signals and

pcu_dataout[31:0]  In
    PCU writes this data.
pcu_addr[8:2]  In
    PCU accesses this address.
pcu_rwn  In
    Global read/write-not signal from PCU.
pcu_te_sel  In
    PCU selects TE for r/w access.
pcu_te_reset  In
    PCU reset.
td_te_doneband, td_te_dataredun, td_te_decode2den,
td_te_variabledatapresent, td_te_encodefixed, td_te_numtags0,
td_te_numtags1, td_te_starttagdataadr, td_te_rawtagdataadr,
td_te_endoftagdata, td_te_firsttaglineheight, td_te_tagdata0,
td_te_tagdata1, td_te_tagdata2, td_te_tagdata3, td_te_countx,
td_te_county, td_te_rtdtagsense, td_te_readsremaining  Out
    PCU readable registers.

TFS (Tag Format Structure)

tfsi_adr0[8:0]  In
    Read address for dot0.
tfsi_adr1[8:0]  In
    Read address for dot1.

Bandstore Signals

cdu_startofbandstore[24:0]  In
    Start memory area allocated for page bands.
cdu_endofbandstore[24:0]  In
    Last address of the memory allocated for page bands.
te_finishedband  Out
    Tag encoder band finished.
Figure 161. TDI Architecture
(Labels shown: te_finishedband, etdRdAdr0, etdRdAdr1, RScontrol, dataRedun, tdValid, lastDotInTag, lastDotInTag1, tagIsPrinted, etd0, etd1.)


26.7.2 Introduction

The tag data interface is responsible for obtaining the raw tag data and encoding it as required by the tag
encoder. The smallest typical tag placement is 2 mm x 2 mm, which means a tag is at least 126 1600 dpi
dots wide.

In order to keep up with a rate of 2 dots per cycle, the PEC1 tag data interface has
been designed to be capable of encoding a tag in 63 cycles. This is actually accomplished in approximately
52 cycles within PEC1. For SoPEC the TE need only produce one dot per cycle; it should be able to produce
tags in no more than twice the time taken by the PEC1 TE. Moreover, any change in implementation
from two dots to one dot per cycle should not lose the 63/52 cycle performance edge attained in the PEC1

As shown in Figure 162, the tag data interface contains a raw tag data interface FSM that fetches tag data
from DRAM, two symbol-at-a-time GF(2^4) Reed-Solomon encoders, an encoded data interface and a state
machine for controlling the encoding process.

The type of encoding used depends on the registers TE_encodefixed, TE_dataredun and TE_decode2den,
the options being:

• (15,5) RS coding, where every 5 input symbols are used to produce 15 output symbols, so the output is
3 times the size of the input. This can be performed on fixed and variable tag data.
• (15,7) RS coding, where every 7 input symbols are used to produce 15 output symbols, so for the same
input the output is smaller than with (15,5) coding (see section 26.7.6). This can be performed on fixed
and variable tag data.
• 2D decoding, where each 2 input bits are used to produce 4 output bits. This can be performed on fixed
and variable tag data.
• no coding, where the data is simply used as is. This can be performed on fixed data only.

Each tag is made up of fixed tag data (the same for each tag on the page) and variable tag data (different
for each tag on the page).

Fixed tag data is supplied either as 120 bits when it is already coded (or no coding is required), or as
40 bits when (15,5) coding is required, or as 56 bits when (15,7) coding is required. Once the fixed tag data
is coded it is 120 bits long. It is then stored in the Encoded Tag Data Interface.

The variable tag data is stored in DRAM in uncoded form. When (15,5) coding is required the 120 bits
stored in DRAM are encoded into 360 bits. When (15,7) coding is required, the 112 bits stored in DRAM are
encoded into 240 bits. When 2D decoding is required the 120 bits stored in DRAM are converted
into 240 bits. In each case the encoded bits are stored in the Encoded Tag Data Interface.

The encoded fixed and variable tag data are eventually used to print the tag.
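
The data sizes quoted for each coding option follow directly from the symbol and codeword sizes, as the sketch below shows. The function names are hypothetical and the sketch only computes sizes; it does not perform Reed-Solomon encoding.

```python
SYMBOL_BITS = 4          # symbol size is always 4 bits
CODEWORD_SYMBOLS = 15    # codeword size is always 15 symbols (60 bits)


def rs_encoded_bits(data_bits, data_symbols):
    """Encoded size in bits for (15, data_symbols) RS coding of data_bits."""
    block_bits = data_symbols * SYMBOL_BITS
    if data_bits % block_bits:
        raise ValueError("data is not a whole number of RS blocks")
    blocks = data_bits // block_bits
    return blocks * CODEWORD_SYMBOLS * SYMBOL_BITS


def decode_2d_bits(data_bits):
    """Output size in bits for 2D decoding (each 2 bits become 4 bits)."""
    return data_bits * 2
```

For example, rs_encoded_bits(120, 5) gives the 360 bits quoted for (15,5) coding of variable data, and rs_encoded_bits(40, 5) and rs_encoded_bits(56, 7) both give the 120 bits of encoded fixed data.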

The encoded fixed tag data is stored in the 8x15-bit registers/RAMs in the Encoded Tag Data Interface. This
data remains unchanged in the registers/RAMs until the next page is ready to be processed.

The unencoded variable tag data for each tag is stored in four 32-bit words. The TE re-reads the variable
tag data for a tag from DRAM each time it is needed. The variable tag data FIFO, which reads from DRAM,
has enough space to store 4 tags.

26.7.3 Data Flow

An overview of the dataflow through the TDI can be seen in Figure 162 below. 



[Figure 162 (diagram not reproduced). Blocks: Raw Tag Data Interface, Tag Data Register, Reed Solomon encoders / 2D decoders, Encoded Tag Data Interface. Annotations on the diagram:]

Raw Tag Data Interface:
• min dots/tag = 126 (specified)
• max dots/line = 1600 x 12.8 = 20480
• max tags/line = 20480/126 = 163
• max variable data/tag = 120 bits
• max amount of tag data/line = 120 x 164
• Split the 120 tag data bits into 2 x 64 bits (8 spare bits)
• Max memory needed for 1 line of tag data = 2 x 64 x 164 = 656 x 32
• Divide each in half to allow for simultaneous READ/WRITE
• Once all this data is loaded it will be valid for at least 126 lines
• 126 lines contain 20480 x 126 = 2580480 dots, processed at 2 dots/cycle
• Therefore the data will be updated at most every 1290240 cycles
• Total memory = 164 x 2 x 64 = 20992 bits
• This requires 10-bit addressing. Bit 9 indicates which buffer.

Tag Data Register:
• The requested tag is READ into this 128-bit buffer.
• This buffer can be updated up to 163 times/line.
• Each tag will be loaded at least 126 times.

Reed Solomon encoders:
• Have to be able to read one tag's data from the Raw Tag Data Interface, RS encode it and store it in the Encoded Tag Data Interface in 63 cycles or less.
• Min tag width = 126 dots, so the fastest that 1 tag can be read = 126/2 = 63 cycles.

Encoded Tag Data Interface (fixed data):
• Encoded fixed data can be up to 120 bits long.
• Use 2 buffers to allow for 2 simultaneous READs in one cycle.
• These stores hold the fixed tag data for 1 tag.
• Total memory = 120 x 2 = 240 bits

Encoded Tag Data Interface (variable data):
• Encoded variable data can be up to 360 bits long.
• Use 2 buffers to allow for 2 simultaneous READs in one cycle.
• Use 2 buffers to allow for simultaneous READ/WRITE.
• Total memory = 360 x 2 x 2 = 1440 bits

Figure 162. Data Flow Through the TDI 
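The sizing figures annotated on Figure 162 can be checked with a few lines of arithmetic. The sketch below is only a restatement of those annotations in Python; the constant names are illustrative, and the value 164 is taken as given from the figure (one more than the computed maximum of 163 tags per line).

```python
import math

DOTS_PER_INCH = 1600
MAX_LINE_TENTHS_OF_INCH = 128   # 12.8 inches, kept integral to avoid float rounding
MIN_DOTS_PER_TAG = 126          # minimum tag width and height, in dots
BUFFER_TAGS = 164               # tag entries provisioned per line in the figure
WORDS_PER_TAG = 2               # 120 bits of variable data held as 2 x 64-bit words

max_dots_per_line = DOTS_PER_INCH * MAX_LINE_TENTHS_OF_INCH // 10
max_tags_per_line = math.ceil(max_dots_per_line / MIN_DOTS_PER_TAG)
buffer_bits = BUFFER_TAGS * WORDS_PER_TAG * 64

# A row of tags spans at least 126 lines; at 2 dots per cycle the variable
# data therefore changes no more often than every dots/2 cycles.
dots_per_tag_row = max_dots_per_line * MIN_DOTS_PER_TAG
min_cycles_between_updates = dots_per_tag_row // 2

# A tag is at least 126 dots wide, so one tag's data must be ready this fast.
min_cycles_per_tag = MIN_DOTS_PER_TAG // 2

if __name__ == "__main__":
    print(max_dots_per_line, max_tags_per_line, buffer_bits)
    print(dots_per_tag_row, min_cycles_between_updates, min_cycles_per_tag)
```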

The TD interface consists of the following main sections: 

• the Raw Tag Data Interface - fetches tag data from DRAM;

• the tag data register;

• 2 Reed Solomon encoders - each encodes one 4-bit symbol at a time;

• the Encoded Tag Data Interface - supplies encoded tag data for output;

• two 2D decoders.
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26.7.4 



Raw tag data interface 

The raw tag data interface (RTDI) provides a simple means of accessing raw tag data in DRAM. The RTDI passes tag data into a FIFO where it can be subsequently read as required. The 64-bit output from the FIFO can be read directly, with a counter value being used to set/reset the enable signal (rtdAvail). The FIFO is clocked out with receipt of an rtdRd signal from the TS FSM.

Figure 163 shows a block diagram of the raw tag data interface. 

[Figure 163: block diagram of the raw tag data interface, showing the DRAM interface, the raw tag data FIFO with its write pointer (wrptr) and read pointer (rdptr), and the rtd state machine. Legible signals: diu_data[63:0], rtd_fifo_wr_en, fifo_wr_en, diu_td_rvalid, te_finishedband, rtdbuf[63:0] (registered in the Tag Data Register) and pclk.]

Figure 163. Raw tag data interface block diagram



26.7.4.1 RTDI FSM 



The RTDI state machine is responsible for keeping the raw tag FIFO full. The state machine reads the line of tag data once for each printline that uses the tag. This means a given line of tag data will be read at least 126 times (the minimum tag height). Note that the first line of tag data may be read fewer times, since the start of the page may be within a tag. In addition, odd and even rows of tags may contain different numbers of tags.
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register before starting the TE by isSSg C^^ "'^ -"mra^^Ay. «««7a^./ 

and NextBandEnable si See SecJon 26T5 , for?^ '""'^ ^ of t^n>ws) 

the TE between bands. * *«^Pf on of the four ways of reprogi^u^ 

iSL?63ts1rT^ri2,^7^T^^^^^ 

The RTDI Sute Flow diag^ is shown in Figure ,64. An expIanaUon of the «ates follows- 

Counter counts is Signed the nZ^rot^^s '^l^,:,^. ""'^^^^^ ^ ^RAM (I • 256bits). 
ndtagsense. Down-counter «,„^ is assig^d t^e nlJ^rt^ Z T'^'^^^ on the value of registi 
fally it niust be set ^^firsnaguZetghtZ^^i^"^^^^^^ " ^« '26). Ini- 

mal tag generation co„«r, will take the ^^^^oTu^^l^^^^^^"^ * P^^ For nor- 

^!5S^™dit;d^r*^^^^^ 

of the FIFO. As long as lr_rd_couZ7i^^^^l I '"^^n'^'od/decremented on writes/ieads 
control signal called td_diu_radn>alid is l^r^fLZ?? . »"ej') *ere must be 4 locations free. A 

are sent in bursts of 1. T^io^mJi^st^'Z^lZTx^T^, I'^u "^"^^ 
TE Code.) controls this signal, (wUl mvolve modification to existing 

Singti:i^T2^;^^^^^ 

^-^^;r^:S~i?;a1S^^^^^^^ 

dots for this row are complete When Ju^tT^* ^i"^' ^ it means all tag 

scaling) for this row of ^Tv^nVZtf t^^^'c^J^^^ " f °^^*>^ (P"- 'o Y 

even). The rav^agdataa^is comparedTtheS .^.^ 5 of «<//a^.,«„e is inverted (odd/ 

the doneband si^ is set. thS^^W si^-al^^^^ l'^^^' ''''''''SdcUaadr ^ endoftagdata 
the rfo„.A W siSl is res; to S bjtTpcTbv wWct?- ''^^ '""^^^ "«til 

£/meA«:gA, registers are setup with new valu^ to re^l tL TP .^JT'^^f^*. ""doftagedata ^nAJirstta. 
from the DIU. Each rime diu id rw./J7s yeh^S^^ " "^'^ *° ^'^ ^-^it reads 

rtrf_^^ra_«,M„, = rtd num is nec^S^^ o 3 « mcremented by 1. The compare of 

n*64-bit data (dependTn^n aS^^Si^I " ? '^^^ or 

values being returned by the DIU. " endojiagdata in the middle of a set of 4*64-bit 

o me laie state. This states also performs the same count on the 
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Variabfedatanresent = " | 



[Figure 164: RTDI state flow diagram. Legible labels include the ACCESS, LOAD and STALL states and transitions conditioned on doneband.]

Figure 164. RTDI State Flow Diagram 



[Figure: DRAM address map for tag data, drawn with addresses increasing. Legible labels: cdu_startofbandstore, TE_endoftagdata (for band N), TE_endoftagdata (for band N+1), cdu_endofbandstore.]

26.7.5 TDI state machine



The tag data state machine has two processing phases. The first processing phase is to encode the fixed tag data stored in the 128-bit (2 x 64-bit) tag data register. The second is to encode tag data as it is required by the tag encoder.

When the Tag Encoder is started up, the fixed tag data is already preloaded in the 128-bit tag data record. If encodeFixed is set, then the 2 codewords stored in the lower bits of the tag data record need to be encoded: 40 bits if dataRedun = 0, and 56 bits if dataRedun = 1. If encodeFixed is clear, then the lower 120 bits of the tag data record must be passed to the encoded tag data interface without being encoded.

When encodeFixed is set, the symbols derived from codeword 0 are written to codeword 6 and the symbols derived from codeword 1 are written to codeword 7. The data symbols are stored first and then the remaining redundancy symbols are stored afterwards, for a total of 15 symbols. Thus, when dataRedun = 0, the 5 symbols derived from bits 0-19 are written to symbols 0-4, and the redundancy symbols are written to symbols 5-14. When dataRedun = 1, the 7 symbols derived from bits 0-27 are written to symbols 0-6, and the redundancy symbols are written to symbols 7-14.

When encodeFixed is clear, the 120 bits of fixed data are copied directly to codewords 6 and 7.

The TDI State Flow diagram is shown in Figure 166. An explanation of the states follows.
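As a rough illustration of how the uncoded fixed data is cut into symbols before RS encoding, the sketch below splits the lower bits of the tag data record into the data symbols of two codewords. It assumes that symbol i occupies bits 4i to 4i+3 and that the second codeword's data follows the first directly (bits 20-39 or 28-55); the text only states the bit range for the first codeword, so both points are assumptions, and the function name is hypothetical.

```python
def split_fixed_data(tag_data: int, data_redun: int):
    """Return (codeword0_data, codeword1_data) as lists of 4-bit symbols.

    data_redun = 0 selects the (15,5) code (5 data symbols per codeword),
    data_redun = 1 selects the (15,7) code (7 data symbols per codeword).
    """
    k = 7 if data_redun else 5
    symbols = [(tag_data >> (4 * i)) & 0xF for i in range(2 * k)]
    return symbols[:k], symbols[k:]


if __name__ == "__main__":
    # Build a record whose symbol i has the value i, for easy inspection.
    record = sum(i << (4 * i) for i in range(14))
    print(split_fixed_data(record, 0))
    print(split_fixed_data(record, 1))
```

After encoding, each list would occupy symbols 0 to k-1 of its codeword, with the 15-k redundancy symbols following.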



[Figure 166: TDI state flow diagram. Legible labels include the load_tag_data state and transition conditions on variabledatapresent, encodefixed, dataredun and decode2den.]

Figure 166. TDI State Flow Diagram

idle:- In the idle state, wait for the tag encoder go signal (top_go = 1). The first task is to either store or encode the fixed data. Once the fixed data is stored, or encoded and stored, the donefixed flag is set. If there is
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and the FSM steps onto the .oadtagdfia s^Z^JytZZ'<^^l^C^X:tt^^^^^ 
tag_data[l 28:64], second 64-bits of rawtag data are assigned to 

■ ™ "»^'«''«'»»«««»' register » Ihen polled m see if there U vufaWe La. to the ti«s If «S 

Encoder e^ co„,,„tSL^r.^22^r^t.t.r(.«rJ^' 

eyele counter of He RS Encotler. Tbs logic cycles for . total of 3- 15 cycles to encotle the l2(Writs 
Sis'^is?.^?:' ">"" level Lm», h« to sdect 7 4.bit sytn. 






decoder are combined and stored in the ETDi. Next the 2 MSBs are decoded to create 4 bits. Again the 4 bits from each decoder are combined and stored in the ETDi.

^aT dV'^'' ^^"^ ^'^^ P*^*' ^ of muxing between the Tag Data register 

and the RS encoders or 2D decoders. Levels 1-2 are controlled by level1_mux and level2_mux, which are generated within the TDI FSM, as is the write address to the ETDi buffers (etd_wr_adr).

ETDlb^l^''^^ ^ ''^"^ "^^^^^ """^ ^°«>^d variable tag data in the 
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127126.. 
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1 


[Figure 167: mapping of TE_tagdata bits through the two RS encoders into codewords 0-5, showing data symbols d0-d29 and parity symbols p0-p59 against the write address wradr(5:0). Legible notes: d0 to d9 are encoded and stored during cycles N to N+14; d10 to d19 are encoded and stored during cycles N+15 to N+29; d20 to d29 are encoded and stored during cycles N+30 to N+44.]

Figure 167. Mapping of the tag data to codewords 0-7
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wra(lr(5.'0} 




[Figure 168: fixed tag data symbols d0-d9 and the parity symbols produced for them, mapped to codewords 6 and 7. Legible note: d0 to d9 are encoded and stored during cycles N to N+14.]

Figure 168. Coding and mapping of uncoded Fixed Tag Data for (15,5) RS encoder



[Figure 169: pre-coded fixed tag data symbols d0-d29 copied from TE_tagdata into codewords 6 and 7. Legible note: d0 to d29 are stored during cycles N to N+14.]

Figure 169. Mapping of pre-coded Fixed Tag Data
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ORAM 




[Figure 170: variable tag data read from DRAM into TE_tagdata(127:0) and passed through the two RS encoders. Legible notes: d0 to d13 are encoded and stored during cycles N to N+14; d14 to d27 are encoded and stored during cycles N+15 to N+29.]

Figure 170. Coding and mapping of Variable Tag Data for (15,7) RS encoder
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TE_tagdata(111:0) 



<% <<5 ^4 dj da da do 



[Figure 171: fixed tag data symbols d0-d13 and their parity symbols p0-p15, mapped to codewords 6 and 7. Legible note: d0 to d13 are encoded and stored during cycles N to N+14.]

Figure 171. Coding and mapping of uncoded Fixed Tag Data for (15,7) RS encoder
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i» = tower 2-bils of symbol x 

Hjt = ^^ after 2D decoding (4-bits long) 
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Figure 172. Mapping of 2D decoded Variable Tag 
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26.7.6 Reed Solomon {RS) Encoder 

26.7.7 Introduction 

Only one type of RS coder is used at any particular time. The RS coder to be used is determined by the registers TE_dataredun and TE_decode2den:

• if TE_dataredun = 0 and TE_decode2den = 0, then use the (15,5) RS coder

• if TE_dataredun = 1 and TE_decode2den = 0, then use the (15,7) RS coder

In each case, k 4-bit data symbols are used to produce a codeword of 15 4-bit symbols.

A simple block diagram can be seen in Figure 173.



[Figure 173: an RS (n,k) encoder with symbol size m = 4, taking k input symbols (1, 2, ... k) and producing n output symbols (1, 2, ... n).]

Figure 173. Simple block diagram for an m=4 Reed Solomon Encoder 

26.7.8 I/O Specification 

An I/O diagram of the RS encoder can be seen in Figure 174.

[Figure 174: Reed Solomon encoder block with inputs pclk, prst_n, rs_data_in[3:0], enable and TE_dataredun.]

Figure 174. RS Encoder I/O diagram 

26.7.9 Proposed implementation 

In the case of the TE, (15,5) and (15,7) codes are to be used with 4 bits per symbol.

The primitive polynomial is $p(x) = x^4 + x + 1$.

In the case of the (15,5) code, this gives a generator polynomial of

$g(x) = (x+\alpha)(x+\alpha^2)(x+\alpha^3)(x+\alpha^4)(x+\alpha^5)(x+\alpha^6)(x+\alpha^7)(x+\alpha^8)(x+\alpha^9)(x+\alpha^{10})$

$g(x) = x^{10} + \alpha^2 x^9 + \alpha^3 x^8 + \alpha^9 x^7 + \alpha^6 x^6 + \alpha^{14} x^5 + \alpha^2 x^4 + \alpha x^3 + \alpha^6 x^2 + \alpha x + \alpha^{10}$

In the case of the (15,7) code, this gives a generator polynomial of

$h(x) = (x+\alpha)(x+\alpha^2)(x+\alpha^3)(x+\alpha^4)(x+\alpha^5)(x+\alpha^6)(x+\alpha^7)(x+\alpha^8)$

$h(x) = x^8 + \alpha^{14} x^7 + \alpha^2 x^6 + \alpha^4 x^5 + \alpha^2 x^4 + \alpha^{13} x^3 + \alpha^5 x^2 + \alpha^{11} x + \alpha^6$

$h(x) = x^8 + h_7 x^7 + h_6 x^6 + h_5 x^5 + h_4 x^4 + h_3 x^3 + h_2 x^2 + h_1 x + h_0$

The output RS codeword is produced by dividing the generator polynomial (g(x) or h(x)) into a polynomial made up from the input symbols.

This division is accomplished using the circuit shown in Figure 175.
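The generator polynomials and the division they drive can be modelled in software. The sketch below is a behavioural Python model, not the hardware of Figure 175: it builds GF(2^4) from p(x) = x^4 + x + 1, expands the generator polynomial from its roots, and performs systematic encoding with a shift register that computes the remainder. Elements are held as 4-bit integers with bit i as the coefficient of x^i, and the first data symbol is treated as the highest-order coefficient; both conventions, and all function names, are assumptions made for the example.

```python
# Build the power table for GF(2^4) with p(x) = x^4 + x + 1.
EXP = [1]
for _ in range(14):
    value = EXP[-1] << 1
    if value & 0x10:
        value ^= 0x13          # subtract (XOR) x^4 + x + 1
    EXP.append(value)
LOG = {value: power for power, value in enumerate(EXP)}


def gf_mul(a: int, b: int) -> int:
    """Multiply two field elements by adding exponents modulo 15."""
    if a == 0 or b == 0:
        return 0
    return EXP[(LOG[a] + LOG[b]) % 15]


def generator_poly(n_parity: int) -> list:
    """Coefficients of (x+a)(x+a^2)...(x+a^n_parity), lowest degree first."""
    g = [1]
    for i in range(1, n_parity + 1):
        root = EXP[i % 15]
        nxt = [0] * (len(g) + 1)
        for degree, coeff in enumerate(g):
            nxt[degree] ^= gf_mul(coeff, root)   # coeff * root
            nxt[degree + 1] ^= coeff             # coeff * x
        g = nxt
    return g


def rs_encode(data: list, k: int) -> list:
    """Systematic (15,k) encode: k data symbols followed by 15-k parity symbols."""
    if len(data) != k:
        raise ValueError("expected exactly k data symbols")
    g = generator_poly(15 - k)
    reg = [0] * (15 - k)                         # reg[-1] is the highest stage
    for symbol in data:
        feedback = symbol ^ reg[-1]
        for j in range(len(reg) - 1, 0, -1):
            reg[j] = reg[j - 1] ^ gf_mul(feedback, g[j])
        reg[0] = gf_mul(feedback, g[0])
    return list(data) + reg[::-1]


if __name__ == "__main__":
    print(generator_poly(10))
    print(rs_encode([1, 2, 3, 4, 5], 5))
```

A valid codeword, read as a polynomial, evaluates to zero at every root of the generator polynomial, which gives a simple self-check for the model.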




[Figure 175: RS encoder circuit with input rs_data_in(3:0). Legend: one symbol denotes a multiplier that multiplies Galois Field elements; the other denotes an adder that adds Galois Field elements.]

Figure 175. (15,5) & (15,7) RS Encoder block diagram

in (15.5) mode con^U is always zero and when operating in (15.7) mode co«^L5 is 

Firstly consider (15,5) mode, i.e. TE_dataredun is set to zero.
For each new set of 5 input symbols, processing is as follows:

put(«_^r:„orei.e£S^^^^^^^ 

and in««^ by simply cloc^f^^J^^^' """^ ^''^^^ ^ ""tP^t '"'^^ 
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A timing diagram is shown below. 




[Figure 176: timing diagram showing clk, rs_data_in[3:0], rs_data_out[3:0], rs_counter and the control signal.]

Figure 176. (15,5) RS Encoder timing diagram

Secondly consider (15,7) mode, i.e. TE_dataredun is set to one.

A timing diagram is shown below. 




Figure 177. (15,7) RS Encoder timing diagram

The enable signal can be used to start/reset the counter and the shift registers 

between each codeword. ^ '"^'^ " "S^Jt there wiU be a delay 

output at a'rate-of 1 sy^^^c^ etn'ot a ^^^^^ ^ — --™'y 

Alternatively, the RS encoder can request data as it requires

i'rd^Ts:;rratitfn;;r^^^^^ 

• encode the raw tag data 

• store the encoded tag data in the Encoded Tag Data Interface 
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26.7.10 Galois Field elements and their representation 

A Galois Field is a set of elements on which addition, subtraction, multiplication and division can be performed without leaving the set.

The TE uses RS encoding over the Galois Field GF(2^4). There are 2^4 elements in GF(2^4) and they are generated using the primitive polynomial p(x) = x^4 + x + 1.

The 16 elements of GF(2^4) can be represented in a number of different ways. Table 129 shows three possible representations: the power, polynomial and 4-tuple representation.



Table 129. GF(2^4) representations

Power     Polynomial             4-tuple (a0 a1 a2 a3)
0         0                      (0 0 0 0)
1         1                      (1 0 0 0)
α         x                      (0 1 0 0)
α^2       x^2                    (0 0 1 0)
α^3       x^3                    (0 0 0 1)
α^4       1 + x                  (1 1 0 0)
α^5       x + x^2                (0 1 1 0)
α^6       x^2 + x^3              (0 0 1 1)
α^7       1 + x + x^3            (1 1 0 1)
α^8       1 + x^2                (1 0 1 0)
α^9       x + x^3                (0 1 0 1)
α^10      1 + x + x^2            (1 1 1 0)
α^11      x + x^2 + x^3          (0 1 1 1)
α^12      1 + x + x^2 + x^3      (1 1 1 1)
α^13      1 + x^2 + x^3          (1 0 1 1)
α^14      1 + x^3                (1 0 0 1)
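The three representations can be generated mechanically by repeatedly multiplying by x and reducing with p(x) = x^4 + x + 1. The following Python sketch does this; the function name and the textual "a^i" labels are illustrative only.

```python
def gf16_elements():
    """Return (power label, 4-tuple (a0, a1, a2, a3)) for all 16 elements."""
    rows = [("0", (0, 0, 0, 0))]
    value = 1                                    # a^0
    for power in range(15):
        bits = tuple((value >> i) & 1 for i in range(4))
        rows.append((f"a^{power}", bits))
        value <<= 1                              # multiply by x
        if value & 0x10:
            value ^= 0x13                        # reduce using x^4 = x + 1
    return rows


def as_polynomial(bits):
    """Render a 4-tuple as a polynomial string such as '1 + x + x^3'."""
    names = ["1", "x", "x^2", "x^3"]
    terms = [names[i] for i in range(4) if bits[i]]
    return " + ".join(terms) if terms else "0"


if __name__ == "__main__":
    for label, bits in gf16_elements():
        print(f"{label:6} {as_polynomial(bits):20} {bits}")
```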



26.7.11 Multiplication of GF(2^4) elements

The multiplication of two field elements $\alpha^a$ and $\alpha^b$ is defined as

$\alpha^c = \alpha^a \alpha^b = \alpha^{(a+b) \bmod 15}$

Thus, for example, $\alpha^{10} \alpha^{8} = \alpha^{18 \bmod 15} = \alpha^{3}$.

So if we have the elements in exponential form, multiplication is simply a matter of modulo 15 addition. If the elements are in polynomial/tuple form, the polynomials must be multiplied and reduced mod $x^4 + x + 1$.

Suppose we wish to multiply the two field elements in GF(2^4):

$\alpha^a = a_3 x^3 + a_2 x^2 + a_1 x + a_0$

$\alpha^b = b_3 x^3 + b_2 x^2 + b_1 x + b_0$

where $a_i$, $b_i$ are in the field (0,1) (i.e. modulo 2 arithmetic).

Multiplying these out and using $x^4 + x + 1 = 0$ we get:

$\alpha^{a+b} = [(a_0 b_3 + a_1 b_2 + a_2 b_1 + a_3 b_0) + a_3 b_3] x^3$
$\quad + [(a_0 b_2 + a_1 b_1 + a_2 b_0) + a_3 b_3 + (a_3 b_2 + a_2 b_3)] x^2$
$\quad + [(a_0 b_1 + a_1 b_0) + (a_3 b_2 + a_2 b_3) + (a_1 b_3 + a_2 b_2 + a_3 b_1)] x$
$\quad + [a_0 b_0 + a_1 b_3 + a_2 b_2 + a_3 b_1]$

$\alpha^{a+b} = [a_0 b_3 + a_1 b_2 + a_2 b_1 + a_3 (b_0 + b_3)] x^3$
$\quad + [a_0 b_2 + a_1 b_1 + a_2 (b_0 + b_3) + a_3 (b_2 + b_3)] x^2$
$\quad + [a_0 b_1 + a_1 (b_0 + b_3) + a_2 (b_2 + b_3) + a_3 (b_1 + b_2)] x$
$\quad + [a_0 b_0 + a_1 b_3 + a_2 b_2 + a_3 b_1]$



If we wish to multiply an arbitrary field element by a fixed field element we get a simpler form. Suppose we wish to multiply $\alpha^b$ by $\alpha^3$.

In this case $\alpha^a = x^3$, so $(a_0\ a_1\ a_2\ a_3) = (0\ 0\ 0\ 1)$. Substituting this into the above equation gives

$\alpha^c = (b_0 + b_3) x^3 + (b_2 + b_3) x^2 + (b_1 + b_2) x + b_1$

This can be implemented using simple XOR gates as shown in Figure 178.
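The XOR form for multiplication by the fixed element x^3 can be checked exhaustively against a general polynomial multiply-and-reduce. The sketch below does so in Python, holding each element as a 4-bit integer with bit i as the coefficient of x^i; that packing and the function names are choices made for the example.

```python
def gf_mul_poly(a: int, b: int) -> int:
    """Multiply two GF(2^4) elements as polynomials, then reduce mod x^4 + x + 1."""
    product = 0
    for i in range(4):
        if (a >> i) & 1:
            product ^= b << i                    # carry-less multiply
    for degree in (6, 5, 4):                     # clear x^6, x^5, x^4 in turn
        if (product >> degree) & 1:
            product ^= 0x13 << (degree - 4)
    return product


def mul_alpha3(b: int) -> int:
    """Multiply by x^3 using only XORs of the input bits."""
    b0, b1, b2, b3 = ((b >> i) & 1 for i in range(4))
    c3 = b0 ^ b3
    c2 = b2 ^ b3
    c1 = b1 ^ b2
    c0 = b1
    return c0 | (c1 << 1) | (c2 << 2) | (c3 << 3)


if __name__ == "__main__":
    for b in range(16):
        print(b, mul_alpha3(b), gf_mul_poly(0b1000, b))
```

The same approach yields an XOR network for any other fixed multiplier by substituting its tuple for (a0 a1 a2 a3).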



[Figure 178: network of exclusive-OR gates producing the outputs c0 to c3 from the inputs b0 to b3.]

Figure 178. Circuit for multiplying by α^3



26.7.12 Addition of GF(2^4) elements

If the elements are in their polynomial/tuple form, the polynomials are simply added.

Suppose we wish to add the two field elements in GF(2^4):

$\alpha^a = a_3 x^3 + a_2 x^2 + a_1 x + a_0$

$\alpha^b = b_3 x^3 + b_2 x^2 + b_1 x + b_0$

where $a_i$, $b_i$ are in the field (0,1) (i.e. modulo 2 arithmetic)

$\alpha^c = \alpha^a + \alpha^b = (a_3 + b_3) x^3 + (a_2 + b_2) x^2 + (a_1 + b_1) x + (a_0 + b_0)$

Again this can be implemented using simple XOR gates as shown in Figure 179.

[Figure 179: four exclusive-OR gates, one per bit, combining the two input elements.]

Figure 179. Adding two field elements



26.7.13 Reed Solomon Implementation

Consider the multiplication

$\alpha^c = \alpha^a \alpha^b$

or in terms of polynomials

$(a_3 x^3 + a_2 x^2 + a_1 x + a_0)(b_3 x^3 + b_2 x^2 + b_1 x + b_0) = (c_3 x^3 + c_2 x^2 + c_1 x + c_0)$

If we substitute all of the possible field elements in for $\alpha^a$ and express $\alpha^c$ in terms of $\alpha^b$, we get the table of results shown in Table 130.

Table 130. $\alpha^c$ expressed in terms of $\alpha^b$

The following signals are required:

• b0, b1, b2, b3,
• (b0+b1), (b0+b2), (b0+b3), (b1+b2), (b1+b3), (b2+b3),
• (b0+b1+b2), (b0+b1+b3), (b0+b2+b3), (b1+b2+b3),
• (b0+b1+b2+b3)


The RS encoder has 4 input lines labelled 0, 1, 2 & 3 and 4 output lines labelled 0, 1, 2 & 3. This labelling corresponds to the subscripts of the polynomial/4-tuple representation. The mapping of 4-bit symbols from the TE_tagdata register into the RS encoder is as follows:

- the LSB in the TE_tagdata is fed into line0

- the next most significant LSB is fed into line1

- the next most significant LSB is fed into line2

- the MSB is fed into line3

The mapping of the RS output lines into each stored symbol is similar:

- line0 is fed into the LSB (bit 0/4)

- line1 is fed into the next most significant LSB (bit 1/5)

- line2 is fed into the next most significant LSB (bit 2/6)

- line3 is fed into the MSB (bit 3/7)
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bf«>2 



[Figure 180: RS encoder implementation built from 4-bit shift register stages and exclusive-OR gates, with input rs_data_in(3:0), output rs_data_out(3:0) and a control signal. Each tap is formed from XOR combinations of b0 to b3. The tap multipliers are g0 (α^10), g1 (α), g2 (α^6), g3 (α), g4 (α^2), g5 (α^14), g6 (α^6), g7 (α^9), g8 (α^3) and g9 (α^2).]

Figure 180. RS Encoder Implementation
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26.7.14 2D Decoder 

The 2D decoder is selected when TE_decode2den = 1. It operates on variable tag data only. Its function is to convert 2 bits into 4 bits according to Table 131.

Table 131. Operation of 2D decoder

Input    Output
0 0      0 0 0 1
0 1      0 0 1 0
1 0      0 1 0 0
1 1      1 0 0 0


26.7.15 Encoded tag data interface

The encoded tag data interface contains an encoded fixed tag data store interface and an encoded variable tag data store interface, as shown in Figure 181.



[Figure 181: encoded tag data interface containing the encoded fixed tag data store and the encoded variable tag data store. Legible signals: dataIn, rdAdr0, rdAdr1, wrAdr, advTag, write enables, and outputs etd0 and etd1.]

Figure 181. Encoded tag data interface
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The two reord units simply reorder the 9 input bits to map low-order codewords into the bit selection com- 
ponent of the address as shown in Table 132. Reordering of write addresses is not necessary since the 
addresses are already in the correct format. 

Table 132. Reord unit

Input bit  Input interpretation      Output bit  Output interpretation
A          select 1 of 8 codewords   A           select 1 of 4 codeword tables
B                                    B
C                                    D           select 1 of 15 symbols
D          select 1 of 15 symbols    E
E                                    F
F                                    G
G                                    C           select 1 of 8 bits
H          select 1 of 4 bits        H
I                                    I
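The reord unit is a fixed permutation of the 9 address bits. The Python sketch below models it under two assumptions that should be checked against the original table: the input bits are named A (most significant) to I (least significant), and the output order is A B D E F G C H I, so that the low-order codeword bit C moves next to the bit-select bits. The names used are illustrative.

```python
INPUT_ORDER = "ABCDEFGHI"    # assumed: A is bit 8, I is bit 0
OUTPUT_ORDER = "ABDEFGCHI"   # assumed output order read from Table 132


def reord(addr9: int) -> int:
    """Reorder a 9-bit read address so bit C joins the bit-select field."""
    bits = {name: (addr9 >> (8 - i)) & 1 for i, name in enumerate(INPUT_ORDER)}
    out = 0
    for name in OUTPUT_ORDER:
        out = (out << 1) | bits[name]
    return out


if __name__ == "__main__":
    print(format(reord(0b001000000), "09b"))   # only C set on input
    print(format(reord(0b000100000), "09b"))   # only D set on input
```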



The encoded fixed data interface is a single 15 x 8-bit RAM with 2 read ports and 1 write port. As it is only written to during page setup time (it is fixed for the duration of a page) there is no need for simultaneous read/write access. However, the fixed data store must be capable of decoding two simultaneous reads in a single cycle. Figure 182 shows the implementation of the fixed data store.



[Figure 182: encoded fixed tag data store, a 15 x 8-bit RAM addressed by rdAdr0, rdAdr1 and wrAdr, with 8-bit data in and outputs out0 and out1.]

Figure 182. Encoded fixed tag data interface

The encoded variable tag data interface is a double buffered 3 x 15 x 8-bit RAM with 2 read ports and 1 write port. The double buffering allows one tag's data to be read (two reads in a single cycle) while the next tag's variable data is being stored. Write addressing is 6 bits: 2 bits of address for selecting 1 of 3, and 4 bits of address for selecting 1 of 15. Read addressing is the same with the addition of 3 more address bits for selecting 1 of 8.

Figure 183 shows the implementation of the encoded variable tag data store. Double buffering is implemented via two sub-buffers. Each time an AdvTag pulse is received, the sense of which sub-buffer is being read from or written to changes. This is accomplished by a 1-bit flag called wrsb0. Although the initial
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rdAdit)' 
[Figure 183: encoded variable tag data interface with variable tag data sub buffer 0 and sub buffer 1, selected by the 1-bit flag wrsb0. Legible signals: rdAdr0, rdAdr1, wrAdr, write enable, out0 and out1.]

Figure 183. Encoded variable tag data interface

[Figure 184: one encoded variable tag data sub-buffer, built from 3 x 15 x 8-bit stores addressed by rdAdr0, rdAdr1 and wrAdr, with outputs out0 and out1.]

Figure 184. Encoded variable tag data sub-buffer
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26.8 Tag Format Structure (TFS) Interface 



26.8.1 Introduction 



The TFS specifies the contents of every dot position within a tag's border, i.e.:

• is the dot part of the background?

• is the dot part of the data?

The TFS is broken up into Tag Line Structures (TLS) which specify the contents of every dot position in a particular line of a tag. Each TLS consists of three tables - A, B and C (see Figure 185).

For a given line of dots, all the tags on that line correspond to the same tag line structure. Consequently, for a given line of output dots, a single tag line structure is required, and not the entire TFS. Double buffering allows the next tag line structure to be fetched from the TFS in DRAM while the existing tag line structure is used to render the current tag line.

The TFS interface is responsible for loading the appropriate line of the tag format structure as the tag encoder advances through the page. It is also responsible for producing table A and table B outputs for two consecutive dot positions in the current tag line.



[Figure 185: the Tag Format Structure in DRAM between TE_tfsstartadr and TE_tfsendadr, holding the tag line structures TLS X_0, TLS X_1, ... TLS X_n for tag X, followed by TLS X+1_0, TLS X+1_1, TLS X+1_2, ... TLS X+1_n. The number of dot lines in a tag is n+1, i.e. TagHeight = n+1. Each TLS is laid out in 32-bit words as:]

• Table A: 24 x 32 bits = 768 bits (384 entries x 2 bits)
• Table B: 9 x 32 bits = 288 bits (32 entries x 9 bits)
• Table C: 10 bits (2 entries x 5 bits), followed by 22 bits reserved and unused

Figure 185. Breakdown of the Tag Format Structure

There is a TLS for every dot line of a tag. 

All tags that are on the same line have the exact same TLS. 

A tag can be up to 384 dots wide, so each of these 384 dots must be specified in the TLS.

The TLS information is stored in DRAM and one TLS must be read in to the TFS Interface for each line of dots that is output to the Tag Plane Line Buffers.

Each TLS consists of 17 64-bit words. This is read from DRAM as 5 256-bit words, with 192 padded bits in the last 256-bit DRAM read.
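The word counts above follow from the table sizes shown in Figure 185. The short Python sketch below reproduces the arithmetic; the constant names are illustrative.

```python
import math

TABLE_A_BITS = 384 * 2          # 384 entries of 2 bits
TABLE_B_BITS = 32 * 9           # 32 entries of 9 bits
TABLE_C_BITS = 2 * 5 + 22       # two 5-bit pointers plus 22 unused bits
DRAM_WORD_BITS = 256

tls_bits = TABLE_A_BITS + TABLE_B_BITS + TABLE_C_BITS
tls_words_64 = tls_bits // 64
tls_words_32 = tls_bits // 32
dram_reads = math.ceil(tls_bits / DRAM_WORD_BITS)
padding_bits = dram_reads * DRAM_WORD_BITS - tls_bits

if __name__ == "__main__":
    print(tls_bits, tls_words_64, tls_words_32, dram_reads, padding_bits)
```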



26.8.2 I/O Specification 

Table 133. Tag Format Structure Interface Port List

Port name                      I/O   Description

Clocks and resets
pclk                           In    SoPEC system clock
prst_n                         In    Active-low, synchronous reset in pclk domain
top_go                         In    Go signal from TE top level

DRAM
diu_data[63:0]                 In    Data from DRAM
diu_tfs_rack                   In    Data acknowledge from DRAM
diu_tfs_rvalid                 In    Data valid from DRAM
tfs_diu_rreq                   Out   Read request to DRAM
tfs_diu_radr[21:5]             Out   Read address to DRAM

Tag encoder top level
top_advtagline                 In    Pulsed after the last line of a row of tags
top_tagaltsense                In    For even tag rows = 0, i.e. 0, 2, 4...
                                     For odd tag rows = 1, i.e. 1, 3, 5...
top_lastdotintag               In    Last dot in tag is currently being processed
top_dotposvalid                In    Current dot position is a tag dot and its structure data and tag data is valid
top_tagdotnum[7:0]             In    Counts from zero up to TE_tagmaxdotpairs (min. = 1, max. = 192)
tfs_valid                      Out   TLS tables A, B and C ready for use
tfs_ta_dot0[1:0]               Out   Even entry from Table A corresponding to top_tagdotnum
tfs_ta_dot1[1:0]               Out   Odd entry from Table A corresponding to top_tagdotnum

Tag encoder top level (PCU read decoder)
tfs_te_tfsstartadr[23:0]       Out   TFS tfsstartadr register
tfs_te_tfsendadr[23:0]         Out   TFS tfsendadr register
tfs_te_tfsfirstlineadr[23:0]   Out   TFS tfsfirstlineadr register
tfs_te_currtfsadr[23:0]        Out   TFS currtfsadr register

TDI
tfs_tdi_adr0[8:0]              Out   Read address for dot0 (even dot)
tfs_tdi_adr1[8:0]              Out   Read address for dot1 (odd dot)
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26.8.2.1 State machine 



The state machine is responsible for generating control signals for the various TFS table units, and for loading the appropriate line from the TFS. The states are explained below.

idle:- Wait for top_go to become active. Pulse adv_tfs_line for 1 cycle to reset the tawradr and tbwradr registers. Pulsing adv_tfs_line will switch the read/write sense of Table B, so Table A is switched here as well to keep things the same, i.e. wrta0 = NOT(wrta0).

diu_access:- In the diu_access state a request is sent to the DIU. Once an ack signal is received, Table A write enable is asserted and the FSM moves to the tls_load state.

tls_load:- The TLS data is ultimately returned by the DIU as 5*(4*64-bit) words. There will be 192 padded bits in the last 256-bit DRAM word. The first 12 64-bit word reads are for Table A, words 12 to 15 and some of 16 are for Table B, while part of word 16 is for Table C. The counter read_num is used to identify which data goes to which table. The Table B data is stored temporarily in a 288-bit register until the tls_update state (hence tbwe does not become active until read_num = 16).

• The DIU data goes directly into Table A (12 * 64).

• The DIU data for Table B is loaded into a 288-bit register.

• The DIU data goes directly into Table C.

tls_update:- The 288 bits in Table B need to be written to a 32*9 buffer. The tls_update state takes care of this using the read_num counter.

tls_next:- This state checks the logic level of tfsvalid and switches the read/write senses of Table A (wrta0), and of Table B a cycle later (using the adv_tfs_line pulse). The reason for switching Table A a cycle early is to make sure the top-level address via tagdotnum is pointing to the correct buffer. Keep in mind that the top level is working a cycle ahead of Table A and 2 cycles ahead of Table B.

If tfsValid is 1, the state machine waits until the advTagLine signal is received. When it is received, the

iSro?S"T°?itm:t'^i;x.^*°''"^^ 

If tfsValid = 0, the state machine pulses advTFSLine (to switch read/write sense in tables A, B, C) and then jumps to the tls_tfsvalid_set state where the signal tfsValid is set to 1 (allowing the tag encoder to start, or
«~^SSr can then start reading the next line of the TFS ftom 

tls_tfsvalid_next:- Simply sets the tfsvalid signal and returns the FSM to the diu_access state.

If an advTagLine signal is received before the next line of the TFS has been read in, tfsValid is cleared to 0 and processing continues as outlined above.
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The TFS state flow diagram is shown in below.. 



[Figure 186: TFSI state flow diagram with states idle, diu_access, tls_load, tls_update, tls_next and tls_tfsvalid_set. Legible transition conditions: top_go == 1, diu_tfs_rack == 1, read_num == 16, read_num == 31, tfs_valid == 0 and top_advtagline == 1.]

Figure 186. TFSI FSM State Flow Diagram

26.8.3 Generating a tag from Tables A, B and C
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Table C contains two 5-bit pointers into table B and is followed bv 22 
unused bits. The total length of each TLS is therefore 34 32-bit words.



Each output dot value is generated as follows. Each entry in Table A consists of 2 bits, bit0 and bit1. These 2 bits are interpreted according to Table 134, Table 135 and Table 136.

Table 134. Interpretation of bit0 from entry in Table A

bit0   Interpretation
0      the output bit comes directly from bit1 (see Table 135)
1      a data bit will be output

Table 135. Interpretation of bit1 from entry in Table A when bit0 = 0

bit1   Interpretation
0      output 0
1      output 1

Table 136. Interpretation of bit1 from entry in Table A when bit0 = 1

bit1   Interpretation
0      output data bit pointed to by current index into Table B
1      output data bit pointed to by current index into Table B and advance index by 1

[...]

Therefore, up to 32 different data bits can appear in a line of a tag [...]. The address of the first data dot
in a tag will be given by the address stored in entry 0 of Table B. [...] will advance through the various
Table B entries [...] tag. Each [...]. To aid address decoding, the addresses are [...] on the RS encoded tag
data. Table 137 lists the interpretation of the 9-bit addresses.



Table 137. Interpretation of 9-bit tag data address in Table B

Field | Description
[name illegible] | Select 1 of 8 codewords. Codewords 0, 1, 2, 3, 4, 5 are variable data, codewords 6, 7 are fixed data.
SymbolSelect | Select 1 of 15 symbols (1111 invalid)
BitSelect | Select 1 of 4 bits from the selected symbol

[Table caption partly illegible: "... for redundancy encoding"]

Bits | Symbols | Codeword
0-19 | 0-4 | 6
20-39 | 0-4 | 7
40-59 | 5-9 | 6
60-79 | 5-9 | 7
80-99 | 10-14 | 6
100-119 | 10-14 | 7

It is important to note that the interpretation of bit1 from Table A (when bit0 = [...]) [...] 5-bit index
is used to cycle through the data addresses in Table B [...] tags start at the left-most dot position within
the tag, so can [...]
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26.8.4 Architecture 

A block diagram of the Tag Format Structure Interface can be seen in Figure 187.

[Figure 187: block diagram. Visible signal labels include taOdd, taEven, etdRdAdr0 and etdRdAdr1.]

Figure 187. TFS Block Diagram
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26.8.4.1 Table A interface 



[Figure 188: block diagram. Visible labels include advTFSLine, tawe, taRdAdr, AdrGen, datain (64 bits), two
16x64-bit RAMs (table A (0) and table A (1)), and 2-bit outputs taEven and taOdd.]

Figure 188. Table A interface block diagram

[...] of Table A (when bit0 = 0) will be passed to the top level 2 cycles after the [...] registering Table A
and Table B outputs, hence this extra registering stage for the generation of ta_dot0_1cyclelater and
ta_dot1_1cyclelater.

Each time an AdvTFSLine pulse is received, the sense of which RAM is being read from or written to
changes. This is accomplished by a 1-bit flag called wrta0. Although the initial state of wrta0 [...],
it must invert upon receipt of an AdvTFSLine pulse. A 4-bit counter called taWrAdr keeps the write [...]

The tawe (table A write enable) input is set whenever the data in is to be written to Table A. The taWrAdr
[...]



[Figure 189: block diagram. Visible labels include advTFSLine, tawe, wrta0 and taWrAdr (4 bits).]

Figure 189. Table A address generator

26.8.4.2 Table C interface

A block diagram of the table C interface is shown below in Figure 190.



[Figure 190: block diagram. Visible signal labels include tagAltSense and advTFSLine.]

Figure 190. Table C interface block diagram

Table 139. AdrGen lookup table

[Table contents missing in source.]

1. X = don't care state.



26.8.4.3 Table B interface

The table B interface implementation generates two encoded tag data addresses (tfsi_adr0, tfsi_adr1) [...].
A block diagram of the table B interface is shown in Figure 191.

[Figure 191: block diagram. Visible labels include tbRdAdr0, tbRdAdr1, tbwe, advTFSLine, read_num (from TFS
FSM), datain (64 bits), pclk, tbWrAdr, AdrGen, and two 32x9-bit tables subB(0) and subB(1) forming the
288-bit Table B.]

Figure 191. Table B interface block diagram

Each time an AdvTFSLine pulse is received, the sense of which sub buffer is being read from or written to
changes.
Note: The output addresses from Table B are registered.
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27 Tag FIFO Unit (TFU) 

27.1 Overview 

The Tag FIFO Unit (TFU) provides the means by which data is transferred between the Tag Encoder (TE)
and the HCU. By abstracting the buffering mechanism and controls from both units, the interface is clean
between the data user and the data generator.

[...] Tag Encoder will provide support for arbitrary Y [...]. X [...] scaling of dot data is performed at the
output of the FIFO in the TFU. There is feedback to the TE from the TFU to allow stalling of the TE during
a line. The TE interfaces to the TFU with a data width of 8 bits. The TFU interfaces to the HCU with a data
width of 1 bit. The depth of the TFU FIFO is chosen as 16 bytes so that the FIFO can store a single 126 dot
tag.

27.1.1 Interfaces between TE, TFU and HCU

[Figure 192: block diagram. The TE drives te_tfu_wdata and te_tfu_wradvline into the TFU FIFO and receives
tfu_te_oktowrite. The TFU drives tfu_hcu_tdata and tfu_hcu_avail to the HCU.]

Figure 192. Interfaces between TE, TFU and HCU

27.1.1.1 TE-TFU Interface

The interface from the TE to the TFU comprises the following signals:

• te_tfu_wdata, 8-bit write data.

• te_tfu_wdatavalid, write data valid.

• te_tfu_wradvline, accompanies the last valid 8-bit write data in a line.

The interface from the TFU to TE comprises the following signal:

• tfu_te_oktowrite, indicating to the TE that there is space available in the TFU FIFO.

The TE writes data to the TFU FIFO when the tfu_te_oktowrite output bit is set. The TE write
will not occur unless data is accompanied by a data valid signal.

27.1.1.2 TFU-HCU Interface

The interface from the TFU to the HCU comprises the following signals:

• tfu_hcu_tdata, 1-bit data.

• tfu_hcu_avail, data valid signal indicating that there is data available in the TFU FIFO.

The interface from HCU to TFU comprises the following signal:

• hcu_tfu_ready (listed as hcu_tfu_advdot in Table 140), indicating to the TFU to supply the next dot.



27.1.1.2.1 X scaling

To account for the case where there may be two SoPEC devices [...] portion of a dot-line, the first dot in a
line may not [...] TFU. The dot will ultimately [...], one on its lead-out and the other on its lead-in.
[...] any dots in the last byte that do not apply to [...] correct number of dots into the tag and its output
will be [...]

27.2 Definitions of I/O

Table 140. TFU Port List

Port name | Pins | I/O | Description
Clocks and resets
pclk | 1 | In | SoPEC functional clock.
prst_n | 1 | In | Global reset signal.
PCU interface data and control signals
pcu_addr[3:2] | 2 | In | PCU address bus. Only 2 bits are required to decode the address space for this block.
pcu_dataout[31:0] | 32 | In | Shared write data bus from the PCU.
tfu_pcu_datain[31:0] | 32 | Out | Read data bus from the TFU to the PCU.
pcu_rwn | 1 | In | Common read/not-write signal from the PCU.
pcu_tfu_sel | 1 | In | Block select from the PCU. When pcu_tfu_sel is high both pcu_addr and pcu_dataout are valid.
tfu_pcu_rdy | 1 | Out | Ready signal to the PCU. When tfu_pcu_rdy is high it indicates the last cycle of the access. For a write cycle this means pcu_dataout has been registered by the block and for a read cycle this means the data on tfu_pcu_datain is valid.
TE interface data and control signals
te_tfu_wdata[7:0] | 8 | In | Write data for TFU FIFO.
te_tfu_wdatavalid | 1 | In | Write data valid signal.
te_tfu_wradvline | 1 | In | Advance line signal strobed when the last byte in a line is placed on te_tfu_wdata.
tfu_te_oktowrite | 1 | Out | Ready signal indicating TFU has space available in its FIFO and is ready to be written to.
HCU interface data and control signals
hcu_tfu_advdot | 1 | In | Signal indicating to the TFU that the HCU is ready to accept the next dot of data from TFU.
tfu_hcu_tdata | 1 | Out | Data from the TFU FIFO.
tfu_hcu_avail | 1 | Out | Signal indicating valid data available from TFU FIFO.
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Configuration Registers 

Table 141. TFU Configuration Registers

Address | Register | Description
Control registers
0x00 | Reset | 1 bit, reset value 1. A write to this register causes a reset of the TFU. This register can be read to indicate the reset state: 0 - reset in progress, 1 - reset not in progress.
0x04 | Go | Writing 1 to this register starts the TFU. Writing 0 to this register halts the TFU. When Go is deasserted the state-machines go to their idle states but all counters and configuration registers keep their values. When Go is asserted all counters are reset, but configuration registers keep their values (i.e. they don't get reset). The TFU must be started before the TE is started. This register can be read to determine if the TFU is running (1 = running, 0 = stopped).
Setup registers (constant during processing of page)
0x08 | XScale | Tag scale factor in X direction.
0x0C | XFracScale | Tag scale factor in X direction for the first dot in a line.
0x10 | TEByteCount | The number of bytes to be accepted from the TE per line. Once this number of bytes have been received subsequent bytes are ignored until there is a strobe on the te_tfu_wradvline.
0x14 | HCUDotCount | The number of (optionally) x-scaled dots per line to be supplied to the HCU. Once this number has been reached the remainder of the current FIFO byte is ignored.

[The source also shows the values 8, 1, 12, 15 and "see text" in the #bits and reset columns for the Go and
setup registers; their row association is not recoverable.]



27.4 Detailed description

[...] ignored until there is a strobe on the te_tfu_wradvline signal, whereupon bytes for the next line are
[...]

[...] FIFO [...] Once this count is reached any [...]

The behaviour of these signals and the control signals between the TFU and the TE and HCU is detailed
below.



[Figure 193: FIFO diagram. te_tfu_wdata is written at FifoWrPtr and tfu_hcu_tdata is read at FifoRdPtr.]

Figure 193. 16-byte FIFO in TFU

// Concurrently Executed Code:

// TE is always allowed to write when there's either (a) room or (b) no room and all
// bytes for that line have been received.
if ((FifoCntnts != FifoMax) OR (FifoCntnts == FifoMax and ByteToRx == 0)) then
    tfu_te_oktowrite = 1
else
    tfu_te_oktowrite = 0

// Data presented to HCU when there is (a) data in FIFO and (b) the HCU has not
// received all dots for a line
if (FifoCntnts != 0) AND (BitToTx != 0) then
    tfu_hcu_avail = 1
else
    tfu_hcu_avail = 0

// Output mux of FIFO data
tfu_hcu_tdata = Fifo[FifoRdPnt][RdBit]

// Sequentially Executed Code:

if (te_tfu_wdatavalid == 1) AND (FifoCntnts != FifoMax) AND (ByteToRx != 0) then
    Fifo[FifoWrPnt] = te_tfu_wdata
    FifoWrPnt++
    FifoContents++
    ByteToRx--

if (te_tfu_wradvline == 1) then
    ByteToRx = TEByteCount

if (hcu_tfu_advdot == 1 and FifoCntnts != 0) then {
    BitToTx--
    if (RepFrac == 1) then
        RepFrac = Xscale
        if (RdBit == 7) then
            RdBit = 0
            FifoRdPnt++
            FifoContents--
        else
            RdBit++
    else
        RepFrac--
    if (BitToTx == 1) then {
        RepFrac = XFracScale
        RdBit = 0
        FifoRdPnt++
        FifoContents--
        BitToTx = HCUDotCount
    }
}



What is not detailed above is the fact that, since this is a circular buffer, both the read- and write-pointers
wrap around to zero after they reach two. Also not detailed is [...] the read and write-pointer in the same
cycle, the FIFO contents [...]
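
The FIFO behaviour described in this section can be modelled in a few lines of Python. This is a behavioural sketch under stated assumptions, not the hardware implementation. The class name TfuFifo, the helper read_line, the start-of-page counter values, the LSB-first bit order within a byte, the wrap of the pointers at the 16-byte FIFO depth, and the treatment of BitToTx as a down counter are all assumptions.

```python
class TfuFifo:
    """Behavioural sketch of the TFU FIFO (not cycle accurate)."""

    FIFO_MAX = 16  # FIFO depth in bytes

    def __init__(self, x_scale, x_frac_scale, te_byte_count, hcu_dot_count):
        self.x_scale = x_scale
        self.x_frac_scale = x_frac_scale
        self.te_byte_count = te_byte_count
        self.hcu_dot_count = hcu_dot_count
        self.fifo = [0] * self.FIFO_MAX
        self.wr_ptr = self.rd_ptr = self.contents = self.rd_bit = 0
        # Assumed start-of-page values for the counters.
        self.rep_frac = x_frac_scale
        self.byte_to_rx = te_byte_count
        self.bit_to_tx = hcu_dot_count

    def ok_to_write(self):
        return self.contents != self.FIFO_MAX or self.byte_to_rx == 0

    def avail(self):
        return self.contents != 0 and self.bit_to_tx != 0

    def tdata(self):
        # Assumption: RdBit 0 selects the least significant bit of the byte.
        return (self.fifo[self.rd_ptr] >> self.rd_bit) & 1

    def write(self, byte, advline=False):
        # Bytes beyond TEByteCount are ignored until the next advline strobe.
        if self.contents != self.FIFO_MAX and self.byte_to_rx != 0:
            self.fifo[self.wr_ptr] = byte & 0xFF
            self.wr_ptr = (self.wr_ptr + 1) % self.FIFO_MAX
            self.contents += 1
            self.byte_to_rx -= 1
        if advline:
            self.byte_to_rx = self.te_byte_count

    def _pop_byte(self):
        self.rd_bit = 0
        self.rd_ptr = (self.rd_ptr + 1) % self.FIFO_MAX
        self.contents -= 1

    def advance_dot(self):
        if self.contents == 0:
            return
        if self.bit_to_tx == 1:
            # Last dot of the line: discard the rest of the current byte.
            self.rep_frac = self.x_frac_scale
            self._pop_byte()
            self.bit_to_tx = self.hcu_dot_count
            return
        self.bit_to_tx -= 1
        if self.rep_frac == 1:
            self.rep_frac = self.x_scale
            if self.rd_bit == 7:
                self._pop_byte()
            else:
                self.rd_bit += 1
        else:
            self.rep_frac -= 1


def read_line(tfu):
    """Drain dots from the FIFO while the avail flag is set."""
    dots = []
    while tfu.avail():
        dots.append(tfu.tdata())
        tfu.advance_dot()
    return dots
```

With XScale = 2, XFracScale = 1 and HCUDotCount = 5, a single byte 0b101 yields the first dot once and the next two dots twice each, after which the remainder of the byte is discarded.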
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28 Halftoner Compositor Unit (HCU) 



28.1 Overview 



The Halftoner Compositor Unit (HCU) produces dots for each nozzle in the destination printhead taking
account of the page dimensions (including margins). The spot data and tag data are received in bi-level
form while the pixel contone data received from the CFU must be dithered to a bi-level representation. [...]
one dot at a time (6 bits) to the next stage in the printing pipeline, namely the dead nozzle compensator
(DNC).



28.2 Data flow 



Figure 194 shows a simple dot data flow high level block diagram of the HCU. The HCU reads contone
data from the CFU, bi-level spot data from the SFU, and bi-level tag data from the TFU. Dither matrices
are read from the DRAM via the DIU. The calculated output dot is read by the DNC.

[Figure 194: block diagram. The Halftoner/Compositor Unit connects to the contone FIFO unit interface, the
DRAM interface unit, the spot FIFO unit interface, the tag FIFO unit interface and the dead nozzle
compensator.]

Figure 194. High level block diagram showing the HCU and its external interfaces

The HCU is given the page dimensions (including margins), and is only started once for the page. It does
not need to be programmed in between bands or restarted for each band. The HCU [...] appropriately [...].
At the end of the page the HCU will continue to produce 0 for all dots as long as [...]

The HCU performs a linear processing of dots, calculating the 6-bit output of a dot in each cycle. The [...]
the spot0 layer over the appropriate contone layer (typically black), the merging of CMY into K [...]
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28.3 DRAM STORAGE requirements 




The storage and bandwidth requirements for some of the possible configurations of the dither matrix are
as follows:

• 4 Kbyte DRAM storage required for one 64x64 (preferred) byte dither matrix

• 6.25 Kbyte DRAM storage required for one 80x80 byte dither matrix

• 16 Kbyte DRAM storage required for four 64x64 byte dither matrices

• 64 Kbyte DRAM storage required for one 256x256 byte dither matrix

Note that regardless of the width of the dither matrix, 256 bytes are always read from DRAM for each line.
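
The storage figures listed above follow directly from one byte per dither matrix entry. A minimal Python check, in which the function name dither_storage_kbytes is purely illustrative:

```python
def dither_storage_kbytes(width: int, height: int, matrices: int = 1) -> float:
    """DRAM storage in Kbytes for `matrices` dither matrices of one byte per entry."""
    return width * height * matrices / 1024


configurations = [(64, 64, 1), (80, 80, 1), (64, 64, 4), (256, 256, 1)]
for width, height, count in configurations:
    size = dither_storage_kbytes(width, height, count)
    print(f"{count} x {width}x{height}: {size} Kbyte")
```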
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28.4 Implementation 

A block diagram of the HCU is given in Figure 195. 
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Figure 195. Block diagram of the HCU 
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28.4.1 Definition of I/O 



Table 142. HCU port list and description

[The port name, pin count and direction columns for the first part of this table are missing in the source.
The surviving descriptions are listed in order.]

• System clock.
• System reset, synchronous active low.
• Block select from the PCU. When pcu_hcu_sel is high both pcu_adr and pcu_dataout are valid.
• Common read/not-write signal from the PCU.
• PCU address bus. Only 6 bits are required to decode the address space for this block.
• Shared write data bus from the PCU.
• [...] is high it indicates the last cycle of the access. For a write cycle this means pcu_dataout has been
  registered by the block and for a read cycle this means the data on hcu_pcu_data is valid.
• Read data bus to the PCU.
• HCU read request, active high. A read request must be accompanied by a valid read address.
• Acknowledge from DIU, active high. Indicates that a read request has been accepted and the new read
  address can be placed on the address bus, hcu_diu_radr.
• HCU read address. 17 bits wide (256-bit aligned word).
• Read data valid, active high. Indicates that valid read data is now on the read data bus, diu_data.
• Read data from DIU.
• Indicates valid data present on cfu_hcu_c[3-0]data lines.
• Pixel of data in contone plane 0.
• Pixel of data in contone plane 1.
• Pixel of data in contone plane 2.
• Pixel of data in contone plane 3.
• Informs the CFU that the HCU has captured the pixel data on cfu_hcu_c[3-0]data lines and the CFU can
  now place the next pixel on the data lines.
• Indicates valid data present on sfu_hcu_sdata.
• Bi-level dot data.
• Informs the SFU that the HCU has captured the dot data on sfu_hcu_sdata and the SFU can now place the
  next dot on the data line.
• Indicates valid data present on tfu_hcu_tdata.
• Tag dot data.

Port name | Pins | I/O | Description
hcu_tfu_advdot | 1 | Out | Informs the TFU that the HCU has captured the dot data on tfu_hcu_tdata and the TFU can now place the next dot on the data line.
DNC interface
dnc_hcu_ready | 1 | In | Indicates that DNC is ready to accept data from the HCU.
hcu_dnc_avail | 1 | Out | Indicates valid data present on hcu_dnc_data.
hcu_dnc_data[5:0] | 6 | Out | Output bi-level dot data in 6 ink planes.



28.4.2 Configuration Registers

The configuration registers in the HCU are programmed via the PCU interface. Refer to section 21.8.2 on
page 257 for the description of the protocol and timing diagrams for reading and writing registers in the
HCU. Note that since addresses in SoPEC are byte aligned and the PCU only supports 32-bit register reads
and writes, the lower 2 bits of the PCU address bus are not required to decode the address space for the
HCU. When reading a register that is less than 32 bits wide zeros should be returned on the upper unused
bit(s) of hcu_pcu_data. The configuration registers of the HCU are listed in Table 143.
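
The two addressing rules above (the lower 2 address bits are ignored, and narrow registers read back with zeros in the unused upper bits) can be sketched as follows. This is an illustrative software model only: the dictionary REGISTER_BITS holds a small subset of the widths from Table 143, and the names REGISTER_BITS and read_register are assumptions.

```python
# Register widths in bits for a few HCU registers (subset of Table 143).
REGISTER_BITS = {0x00: 1, 0x04: 1, 0x10: 4, 0x14: 4, 0x1C: 16}


def read_register(storage: dict, byte_address: int) -> int:
    """Return the 32-bit read value for a register access."""
    word_address = byte_address & ~0x3           # lower 2 address bits are ignored
    mask = (1 << REGISTER_BITS[word_address]) - 1
    return storage.get(word_address, 0) & mask   # unused upper bits read as zero


# MaxDot (0x1C) is 16 bits wide, so only the low 16 bits are returned.
registers = {0x1C: 0xABCD35FF}
print(hex(read_register(registers, 0x1C)))
```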



Table 143. HCU Registers

Address | Register | #bits | Reset | Description
Control registers
0x00 | Reset | 1 | 0x1 | A write to this register causes a reset of the HCU.
0x04 | Go | 1 | 0x0 | Writing 1 to this register starts the HCU. Writing 0 to this register halts the HCU. When Go is asserted all counters, flags etc. are cleared or given their initial value, but configuration registers keep their values. When Go is deasserted the state-machines go to their idle states but all counters and configuration registers keep their values. The HCU should be started after the CFU, SFU, TFU, and DNC. This register can be read to determine if the HCU is running (1 = running, 0 = stopped).
Setup registers (constant during processing)
0x10 | AvailMask | 4 | 0x0 | Mask used to determine which of the dotgen units etc. are to be checked before a dot is generated by the HCU within the specified margins for the specified color plane. If the specified dotgen unit is stalled, then the HCU will also stall. See Table 144 for bit allocation and definition.
0x14 | TMMask | 4 | 0x0 | Same as AvailMask, but used in the top margin area before the appropriate target page is reached.
0x18 | PageMarginY | 32 | 0x0000_0000 | The first line considered to be off the page.
0x1C | MaxDot | 16 | 0x0000 | This is the maximum dot number - 1 present across a page. For example if a page contains 13824 dots, then MaxDot will be 13823.



















0x20 | TopMargin | 32 | 0x0000_0000 | The first line on a page to be considered within the target page for contone and spot data. (0 = first printed line of page)
0x24 | BottomMargin | 32 | 0x0000_0000 | The first line in the target bottom margin for contone and spot data (i.e. first line after target page).
0x28 | LeftMargin | 16 | 0x0000 | The first dot on a line within the target page for contone and spot data.
0x2C | RightMargin | 16 | 0xFFFF | The first dot on a line within the target right margin for contone and spot data.
0x30 | TagTopMargin | 32 | 0x0000_0000 | The first line on a page to be considered within the target page for tag data. (0 = first printed line of page)
0x34 | TagBottomMargin | 32 | 0x0000_0000 | The first line in the target bottom margin for tag data (i.e. first line after target page).
0x38 | TagLeftMargin | 16 | 0x0000 | The first dot on a line within the target page for tag data.
0x3C | TagRightMargin | 16 | 0xFFFF | The first dot on a line within the target right margin for tag data.
0x40 | DMReadEnable | 1 | 0x0 | 1 if a dither matrix is specified, 0 if a dither matrix is not specified.
0x44 | StartDMAdr | 17 | 0x0_0000 | Points to the first 256-bit word of the first line of the dither matrix in DRAM.
0x48 | EndDMAdr | 17 | 0x0_0000 | Points to the last 256-bit word of the last line of the dither matrix in DRAM.
0x4C | LineIncrement | 5 | 0x2 | The number of 256-bit words in DRAM from the start of one line of the dither matrix and the start of the next line, i.e. the value by which the DRAM address is incremented at the start of a line so that it points to the start of the next line of the dither matrix.
0x50 | DMInitIndexC0 | 8 | 0x00 | Initial index within 256-byte dither matrix line buffer for contone plane 0. If using double-buffer scheme, only the 7 lsbs are used.
0x54 | DMLwrIndexC0 | 8 | 0x00 | Lower index within 256-byte dither matrix line buffer for contone plane 0. If using double-buffer scheme, only the 7 lsbs are used.
0x58 | DMUprIndexC0 | 8 | 0x3F | Upper index within 256-byte dither matrix line buffer for contone plane 0. After reading the data at this location the index wraps to DMLwrIndexC0. If using double-buffer scheme, only the 7 lsbs are used.
0x5C | DMInitIndexC1 | 8 | 0x00 | Initial index within 256-byte dither matrix line buffer for contone plane 1. If using double-buffer scheme, only the 7 lsbs are used.
0x60 | DMLwrIndexC1 | 8 | 0x00 | Lower index within 256-byte dither matrix line buffer for contone plane 1. If using double-buffer scheme, only the 7 lsbs are used.
0x64 | DMUprIndexC1 | 8 | 0x3F | Upper index within 256-byte dither matrix line buffer for contone plane 1. After reading the data at this location the index wraps to DMLwrIndexC1. If using double-buffer scheme, only the 7 lsbs are used.











0x68 | DMInitIndexC2 | 8 | 0x00 | Initial index within 256-byte dither matrix line buffer for contone plane 2. If using double-buffer scheme, only the 7 lsbs are used.
0x6C | DMLwrIndexC2 | 8 | 0x00 | Lower index within 256-byte dither matrix line buffer for contone plane 2. If using double-buffer scheme, only the 7 lsbs are used.
0x70 | DMUprIndexC2 | 8 | 0x3F | Upper index within 256-byte dither matrix line buffer for contone plane 2. After reading the data at this location the index wraps to DMLwrIndexC2. If using double-buffer scheme, only the 7 lsbs are used.
0x74 | DMInitIndexC3 | 8 | 0x00 | Initial index within 256-byte dither matrix line buffer for contone plane 3. If using double-buffer scheme, only the 7 lsbs are used.
0x78 | DMLwrIndexC3 | 8 | 0x00 | Lower index within 256-byte dither matrix line buffer for contone plane 3. If using double-buffer scheme, only the 7 lsbs are used.
0x7C | DMUprIndexC3 | 8 | 0x3F | Upper index within 256-byte dither matrix line buffer for contone plane 3. After reading the data at this location the index wraps to DMLwrIndexC3. If using double-buffer scheme, only the 7 lsbs are used.
0x80 | DoubleLineBuf | 1 | 0x1 | Selects the dither line buffer mode to be single or double buffer.
0x84 to 0x98 | IOMappingLo | 6x32 | 0x0000_0000 | The dot reorg mapping for output inks 0 to 5. For each ink's 64-bit IOMapping value, IOMappingLo represents the low order 32 bits.
0x9C to 0xB0 | IOMappingHi | 6x32 | 0x0000_0000 | The dot reorg mapping for output inks 0 to 5. For each ink's 64-bit IOMapping value, IOMappingHi represents the high order 32 bits.
0xB4 to 0xC0 | cpConstant | 4x8 | 0x00 | The constant contone value to output for contone plane N when printing in the margin areas of the page. This value will typically be 0.
0xC4 | sConstant | 1 | 0x0 | The constant bi-level value to output for spot when printing in the margin areas of the page. This value will typically be 0.
0xC8 | tConstant | 1 | 0x0 | The constant bi-level value to output for tag data when printing in the margin areas of the page. This value will typically be 0.
0xCC | DitherConstant | 8 | 0xFF | The constant value to use for dither matrix when the dither matrix is not available, i.e. when the signal dm_avail is 0. This value will typically be 0xFF so that cpConstant can easily be 0x00 or 0xFF without requiring a dither matrix (DitherConstant is primarily used for threshold dithering in the margin areas).
Debug registers (read only)







0xD0 | [name illegible] | [illegible] | N/A | Bit 13 = tfu_hcu_avail; Bit 12 = hcu_tfu_advdot; Bit 10 = hcu_sfu_advdot; Bit 9 = cfu_hcu_avail; Bit 8 = hcu_cfu_advdot; Bit 7 = dnc_hcu_ready; Bit 6 = hcu_dnc_avail; Bits 5-0 = hcu_dnc_data
0xD4 | HcuDotgenDebug | 15 | N/A | Bit 14 = after_top_margin; Bit 13 = in_tag_target_page; Bit 12 = in_target_page; Bit 11 = tp_avail; Bit 10 = s_avail; Bit 9 = cp_avail; Bit 8 = dm_avail; Bit 7 = advdot; Bits 5-0 = [tp, s, cp3, cp2, cp1, cp0] (i.e. 6-bit input to dot reorg units)
0xD8 | HcuDitherDebug1 | 17 | N/A | Bit 9 = advdot; Bit 8 = dm_avail; Bits 15-8 = cp1_dither_val; Bits 7-0 = cp0_dither_val
0xDC | HcuDitherDebug2 | 17 | N/A | Bit 9 = advdot; Bit 8 = dm_avail; Bits 15-8 = cp3_dither_val; Bits 7-0 = cp2_dither_val



28.4.3 Control unit 



The control unit is responsible for controlling the overall flow of the HCU. It is responsible for
determining whether or not a dot will be generated in a given cycle, and what dot will actually be generated,
including whether or not the dot is in a margin area, and what dither cell values should be used at the
specific dot location. A block diagram of the control unit is shown in Figure 196.

The inputs to the control unit are a number of avail flags specifying whether or not a given dotgen unit is
capable of supplying 'real' data in this cycle. The term 'real' refers to data generated from external
sources, such as contone line buffers, bi-level line buffers, and tag plane buffers. Each dotgen unit informs
the control unit whether or not a dot can be generated this cycle from real data. It must also check that the
DNC is ready to receive data.

The contone/spot margin unit is responsible for determining whether the current dot coordinate is within
the target contone/spot margins, and the tag margin unit is responsible for determining whether the current
dot coordinate is within the target tag margins.

The dither matrix table interface provides the interface to DRAM for the generation of dither cell values
that are used in the halftoning process in the contone dotgen unit.

Figure 196. Block diagram of the control unit

28.4.3.1 Determine AdvDot



The HCU does not always require contone planes, bi-level or tag planes in order to produce a page. For
example, a given page may not have a bi-level layer, or a tag layer. In addition, the contone and bi-level
parts of a page are only required within the contone and bi-level page margins, and the tag part of a page is
only required within the tag page margins. Thus output dots can be generated without contone, bi-level or
tag data before the respective top margins of a page have been reached, and 0s are generated for all color
planes after the end of the page has been reached (to allow later stages of the printing pipeline to flush).

Consequently the HCU has an AvailMask register that determines which of the various input avail flags
should be taken notice of during the production of a page from the first line of the target page, and a
TMMask register that has the same behaviour, but is used in the lines before the target page has been
reached (i.e. inside the target top margin area). Each bit in the AvailMask refers to a particular avail bit: if
the bit in the AvailMask register is set, then the corresponding avail bit must be 1 for the HCU to advance
a dot. The bit to avail correspondence is shown in Table 144. Care should be taken with TMMask: if the
particular data is not available after the top margin has been reached, then the HCU will stall. Note that the
avail bits for contone and spot colors are ANDed with in_target_page after the target page area has been
reached to allow dot production in the contone/spot margin areas without needing any data in the CFU and
SFU. The avail bit for tag color is ANDed with in_tag_target_page after the target tag page area has been
reached to allow dot production in the tag margin areas without needing any data in the TFU.



Table 144. Correspondence between bit in AvailMask and avail flag

Bit in AvailMask | Avail flag | Description
0 | dm_avail | dither matrix data available
1 | cp_avail | contone pixels available
2 | s_avail | spot color available
3 | tp_avail | tag plane available



Each of the input avail bits is processed with its appropriate mask bit and the after_top_margin flag. The
output bits are ANDed together along with Go and ok_to_write (which specifies whether the output buffer
is ready to receive a dot in this cycle) to form the output bit advdot. We also generate wr_advdot. In this
way, if the output buffer is full or any of the specified avail flags is clear, the HCU will stall. When the end
of the page is reached, in_page will be deasserted and the HCU will continue to produce 0 for all dots as
long as the DNC requests data. A block diagram of the determine advdot unit is shown in Figure 197.

The ok_to_read signal from the output buffer indicates that the HCU has a dot available for the DNC to
read (indicated to the DNC by the assertion of hcu_dnc_avail). If the DNC is ready to receive the dot
(dnc_hcu_ready is 1) then the dot is read from the output buffer by asserting rd_advdot.
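
The mask logic described above can be sketched in Python as follows. This is a simplified illustration, not the actual gate-level design: the function name advdot and its argument packing are assumptions, the avail flags are packed using the bit positions of Table 144, and the additional gating of avail bits by in_target_page and in_tag_target_page is deliberately left out.

```python
def advdot(avail_flags: int, avail_mask: int, tm_mask: int,
           after_top_margin: bool, go: bool, ok_to_write: bool) -> bool:
    """Decide whether the HCU may advance a dot this cycle.

    avail_flags packs dm_avail (bit 0), cp_avail (bit 1), s_avail (bit 2)
    and tp_avail (bit 3), matching Table 144.
    """
    # TMMask applies before the target page is reached, AvailMask afterwards.
    mask = avail_mask if after_top_margin else tm_mask
    # Every flag selected by the mask must be 1, unselected flags are ignored.
    required_flags_ok = (avail_flags & mask) == mask
    return bool(go and ok_to_write and required_flags_ok)


# Contone page with a dither matrix: dm_avail and cp_avail are required.
print(advdot(0b0001, avail_mask=0b0011, tm_mask=0b0000,
             after_top_margin=True, go=True, ok_to_write=True))
```

In this example cp_avail is clear while AvailMask requires it, so the function returns False and the modelled HCU stalls.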
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Figure 197. Block diagram of determine advdot unit 



28.4.3.2 Position unit 



The position unit is responsible for outputting the position of the current dot (curr_pos, curr_line) and
whether or not this dot is the last dot of a line (advline). Both curr_pos and curr_line are set to 0 at reset or
when Go transitions from 0 to 1. The position unit relies on the advdot input signal to advance through the
dots on a page. Whenever an advdot pulse is received, curr_pos gets incremented. If curr_pos equals
max_dot then an advline pulse is generated as this is the last dot in a line, curr_line gets incremented, and
the curr_pos is reset to 0 to start counting the dots for the next line.
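
A minimal Python model of the counting behaviour just described is given below. The class name PositionUnit and its step method are illustrative assumptions, not names from the design.

```python
class PositionUnit:
    """Tracks the current dot position and line, as described above."""

    def __init__(self, max_dot: int):
        self.max_dot = max_dot
        self.curr_pos = 0   # both counters are 0 at reset or when Go rises
        self.curr_line = 0

    def step(self, advdot: bool) -> bool:
        """Process one cycle and return True when an advline pulse is generated."""
        if not advdot:
            return False
        if self.curr_pos == self.max_dot:
            # Last dot in the line: move to the start of the next line.
            self.curr_line += 1
            self.curr_pos = 0
            return True
        self.curr_pos += 1
        return False


unit = PositionUnit(max_dot=2)          # three dots per line: 0, 1, 2
pulses = [unit.step(True) for _ in range(3)]
print(pulses, unit.curr_pos, unit.curr_line)
```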

28.4.3.3 Margin unit

The responsibility of the margin unit is to determine whether the specific dot coordinate is within the page
at all, within the target page or in a margin area (see Figure 198). This unit is instantiated for both the
contone/spot margin unit and the tag margin unit.

[Figure 198: page structure. The target page lies inside the printable page area (physical page), bounded
by the target top margin and the target bottom margin.]

Figure 198. Page structure



The margin unit takes the current dot and line position, and returns three flags.

• the first, in_page, is 1 if the current dot is within the page, and 0 if it is outside the page,

• the second flag, in_target_page, is 1 if the dot coordinate is within the target page area of the page, and
0 if it is within the target top/left/bottom/right margins,

• the third flag, after_top_margin, is 1 if the current dot is below the target top margin, and 0 if it is
within the target top margin.
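
The three flags can be derived from the margin registers of Table 143 roughly as in the Python sketch below. The exact comparisons are inferred from the register descriptions ("first line considered to be off the page", "first dot on a line within the target right margin", and so on) and are assumptions, as are the names Margins and margin_flags.

```python
from dataclasses import dataclass


@dataclass
class Margins:
    page_margin_y: int  # first line considered to be off the page
    max_dot: int        # last dot number on a line
    top: int            # first line within the target page
    bottom: int         # first line in the target bottom margin
    left: int           # first dot within the target page
    right: int          # first dot in the target right margin


def margin_flags(m: Margins, pos: int, line: int):
    """Return (in_page, in_target_page, after_top_margin) for a dot coordinate."""
    in_page = line < m.page_margin_y and pos <= m.max_dot
    in_target_page = (in_page
                      and m.top <= line < m.bottom
                      and m.left <= pos < m.right)
    after_top_margin = line >= m.top
    return in_page, in_target_page, after_top_margin


m = Margins(page_margin_y=100, max_dot=49, top=10, bottom=90, left=5, right=45)
print(margin_flags(m, pos=5, line=10))
```

The same function would serve both instances of the unit, once with the contone/spot margin registers and once with the tag margin registers.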

A block diagram of the margin unit is shown in Figure 199.
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Figure 199. Block diagram of margin unit 
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28.4.3.4 Dither matrix table interface 

The dither matrix table interface provides the interface to DRAM for the generation of dither cell values
that are used in the halftoning process in the contone dotgen unit. The control flag dm_read_enable
enables the reading of the dither matrix table line structure from DRAM. If dm_read_enable is 0 the
dither matrix is not specified in DRAM and no DRAM accesses are attempted. The dither matrix table
interface has an output flag dm_avail which specifies if the current line of the specified matrix is available.
The HCU can be directed to stall when dm_avail is 0 by setting the appropriate bit in the HCU's AvailMask
or TMMask registers. When dm_avail is 0 the value in the DitherConstant register is used as the
dither cell values that are output to the contone dotgen unit.

The dither matrix table interface consists of a state machine that interfaces to the DRAM interface, a dither
matrix buffer that provides dither matrix values, and a unit to generate the addresses for reading the buffer.
Figure 200 shows a block diagram of the dither matrix table interface.
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Figure 200. Block diagram of dither matrix table Interface 

28.4.3.4.1 Dither matrix buffer 

The state machine loads dither matrix table data a line at a time from DRAM and stores it in a buffer. A
single line of the dither matrix is either 256 or 128 8-bit entries, depending on the programmable bit
DoubleLineBuf. If this bit is enabled, a double-buffer mechanism is employed such that while one buffer is
read from for the current line's dither matrix data (8 bits representing a single dither matrix entry), the
other buffer is being written to with the next line's dither matrix data (64 bits at a time). Alternatively, the
single buffer scheme can be used, where the data must be loaded at the end of the line, thus incurring a
delay.

The single/double buffer is implemented using a 256-byte 3-port register array (two read ports, one write port),
with the reads clocked at double the system clock rate (320 MHz), allowing 4 reads per clock cycle.

The dither matrix buffer unit also keeps track of the current read and write buffers, and ensures that a
buffer cannot be read from until it has been written to. In this case, each buffer is a line of the dither
matrix, i.e. 256 or 128 bytes.

A bit is kept for the status of each dither matrix line buffer: buff_avail[0] and buff_avail[1]. It also keeps a
single bit (rd_buff) for the current buffer that reads are to occur from, and a single bit (wr_buff) for the
current buffer that writes are to occur to. The output value dm_avail equals buff_avail[rd_buff]. The output
value ok_to_write equals ~buff_avail[wr_buff]. Note that when using a single line buffer, buff_avail[1] is
not used.

The read addresses are byte aligned. A single dither matrix entry is represented by 8 bits and an entry is
read for each of the four contone planes in parallel. When an advline pulse is received, buff_avail[rd_buff]
is cleared, and rd_buff is inverted (if using a double line buffer).

Data is written 64 bits at a time to the current write buffer when diu_hcu_rvalid is asserted. When WrAdr
is 0x1F and diu_hcu_rvalid is 1, buff_avail[wr_buff] is set, and wr_buff is inverted (if using a double line
buffer). This indicates that a line of dither matrix has been written to the current write buffer and it is now
available to be read.
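The buffer bookkeeping described above can be modelled with a small Python class. This is a behavioural sketch only, not the hardware implementation: the class and method names are invented for illustration, and it assumes ok_to_write is the inverse of buff_avail[wr_buff].

```python
class DitherLineBuffers:
    """Behavioural model of the buff_avail / rd_buff / wr_buff bookkeeping."""

    def __init__(self, double_line_buf):
        self.double = double_line_buf
        self.buff_avail = [False, False]  # buff_avail[1] is unused in single-buffer mode
        self.rd_buff = 0
        self.wr_buff = 0

    @property
    def dm_avail(self):
        # The current read buffer holds a complete dither matrix line.
        return self.buff_avail[self.rd_buff]

    @property
    def ok_to_write(self):
        # Assumed: writing is allowed only while the write buffer is not full.
        return not self.buff_avail[self.wr_buff]

    def line_written(self):
        """Call when the last 64-bit word of a line has been stored."""
        self.buff_avail[self.wr_buff] = True
        if self.double:
            self.wr_buff ^= 1

    def advline(self):
        """Call on an advline pulse: the current read line has been consumed."""
        self.buff_avail[self.rd_buff] = False
        if self.double:
            self.rd_buff ^= 1
```

In double-buffer mode the model lets two lines be written before the first advline, after which the writer may refill the buffer that was just consumed.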

28.4.3.4.2 Read address generator 

For each contone plane there is an initial, lower and upper index to be used when reading dither cell values
from the dither matrix double buffer. The read address for each plane is used to select a byte from the
current 256-byte read buffer. When Go gets set (0 to 1 transition), or at the end of a line, the read addresses
are set to their corresponding initial index. Otherwise, the read address generator relies on advdot to
advance the addresses within the inclusive range specified by the lower and upper indices, represented by the
following pseudocode:

if (advdot == 1) then
    if (advline == 1) then
        rd_adr = dm_init_index
    elsif (rd_adr == dm_upr_index) then
        rd_adr = dm_lwr_index
    else
        rd_adr ++
else
    rd_adr = rd_adr

28.4.3.4.3 State machine 

The dither matrix is read from DRAM in single 256-bit accesses, receiving the data from the DIU over 4
clock cycles (64 bits per cycle). The protocol and timing for read accesses to DRAM is described in
section 20.9.1 on page 208. Read accesses to DRAM are implemented by means of the state machine
described in Figure 201.

All counters and flags should be cleared after reset or when Go transitions from 0 to 1. While the Go bit is
1, the state machine relies on the dm_read_enable bit to tell it whether to attempt to read dither matrix data
from DRAM. When dm_read_enable is clear, the state machine does nothing and remains in the idle state.
When dm_read_enable is set, the state machine continues to load dither matrix data, 256 bits at a time
(received over 4 clock cycles, 64 bits per cycle), while there is space available in the dither matrix buffer.
The read address and line_start_adr are initially set to start_dm_adr. The read address gets incremented
after each read access. It takes 4 or 8 read accesses to load a line of dither matrix into the dither matrix
buffer, depending on whether we're using a single or double buffer. A count is kept of the accesses to
DRAM. When a read access completes and access_count equals 3 or 7, a line of dither matrix has just
been loaded from DRAM and the read address is updated to line_start_adr plus line_increment so it points to
the start of the next line of dither matrix (line_start_adr is also updated to this value). If the read address
equals end_dm_adr then the next read address will be start_dm_adr, thus the read address wraps to point
to the start of the area in DRAM where the dither matrix is stored.

The write address for the dither matrix buffer is implemented by means of a modulo-32 counter that is
initially set to 0 and incremented when diu_hcu_rvalid is asserted.
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Figure 201. State machine to read dither matrix table 
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28.4.4 Contone dotgen unit 



The contone dotgen unit is responsible for producing a dot in up to 4 color planes per cycle. The contone
dotgen unit also produces a cp_avail flag which specifies whether or not contone pixels are currently
available, and the output hcu_cfu_advdot to request the CFU to provide the next contone pixel in up to 4 color
planes.

The block diagram for the contone dotgen unit is shown in Figure 202.



[Figure 202 labels: contone dotgen unit containing dither unit 0 to dither unit 3; signals hcu_cfu_advdot,
cfu_hcu_c0data, cfu_hcu_c1data, cfu_hcu_c2data, cfu_hcu_c3data, cfu_hcu_avail, cp0_constant to cp3_constant,
advdot, in_target_page, cp0_dither_val to cp3_dither_val, cp0dot to cp3dot, cp_avail]

Figure 202. Contone dotgen unit

A dither unit provides the functionality for dithering a single contone plane. The contone image is only
defined within the contone/spot margin area. As a result, if the input flag in_target_page is 0, then a
constant contone pixel value is used for the pixel instead of the contone plane.

The resultant contone pixel is then halftoned. The dither value to be used in the halftoning process is
provided by the control data unit. The halftoning process involves a comparison between a pixel value and its
corresponding dither value. If the 8-bit contone value is greater than or equal to the 8-bit dither matrix
value a 1 is output. If not, then a 0 is output. This means each entry in the dither matrix is in the range
1-255 (0 is not used).
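The behaviour of a single dither unit can be sketched as a short Python function. The function and parameter names are illustrative only; cp_constant stands for the constant contone value substituted outside the target page.

```python
def dither_dot(contone, dither, in_target_page, cp_constant):
    """Halftone one 8-bit contone pixel against one 8-bit dither matrix entry."""
    # Outside the target page the contone plane is undefined, so use the constant.
    pixel = contone if in_target_page else cp_constant
    # A dot is output when the pixel value reaches the dither threshold.
    return 1 if pixel >= dither else 0
```

Because dither entries lie in the range 1-255, a pixel value of 0 never produces a dot and a pixel value of 255 always does.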
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28.4.5 Spot dotgen unit 

The spot dotgen unit is responsible for producing a dot of bi-level data per cycle. It deals with bi-level data
(and therefore does not need to halftone) that comes from the LBD via the SFU. Like the contone layer,
the bi-level spot layer is only defined within the contone/spot margin area. As a result, if input flag
in_target_page is 0, then a constant dot value (typically this would be 0) is used for the output dot.

The spot dotgen unit also produces an s_avail flag which specifies whether or not spot dots are currently
available for this spot plane, and the output hcu_sfu_advdot to request the SFU to provide the next bi-level
data value. The spot dotgen unit can be represented by the following pseudocode:

s_avail = sfu_hcu_avail

if (in_target_page == 1 AND advdot == 1) then
    hcu_sfu_advdot = 1
else
    hcu_sfu_advdot = 0

if (in_target_page == 1) then
    sp = sfu_hcu_sdata
else
    sp = sp_constant

28.4.6 Tag dotgen unit 

This unit is very similar to the spot dotgen unit (see Section 28.4.5) in that it deals with bi-level data, in
this case from the TE via the TFU. The tag layer is only defined within the tag margin area. As a result, if
input flag in_tag_target_page is 0, then a constant dot value, tp_constant (typically this would be 0), is
used for the output dot. The tagplane dotgen unit also produces a tp_avail flag which specifies whether or
not tag dots are currently available for the tagplane, and the output hcu_tfu_advdot to request the TFU to
provide the next bi-level data value.



28.4.7 Dot reorg unit 

The dot reorg unit provides a means of mapping the bi-level dithered data, the spot0 color, and the tag data
to output inks in the actual printhead. Each dot reorg unit takes a set of 6 1-bit inputs and produces a single
bit output that represents the output dot for that color plane.

The output bit is a logical combination of any or all of the input bits. This allows the spot color to be
placed in any output color plane (including infrared for testing purposes), black to be merged into cyan,
magenta and yellow (in the case of no black ink in the Memjet printhead), and tag dot data to be placed in
a visible plane. An output for fixative can readily be generated by simply combining desired input bits.

The dot reorg unit contains a 64-bit lookup to allow complete freedom with regards to mapping. Since all
possible combinations of input bits are accounted for in the 64-bit lookup, a given dot reorg unit can take
the mapping of other reorg units into account. For example, a black plane reorg unit may produce a 1 only
if the contone plane 3 or spot color inputs are set (this effectively composites black bi-level over the
contone). A fixative reorg unit may generate a 1 if any 2 of the output color planes is set (taking into account
the mappings produced by the other reorg units).

If dead nozzle replacement is to be used (see section 29.4.2 on page 448), the dot reorg can be
programmed to direct the dots of the specified color into the main plane, and 0 into the other. If a nozzle is
then marked as dead in the DNC, swapping the bits between the planes will result in 0 in the dead nozzle,
and the required data in the other plane.

If dead nozzle replacement is to be used, and there are no tags, the TE can be programmed with the
position of dead nozzles and the resultant pattern used to direct dots into the specified nozzle row. If only fixed
background TFS is to be used, a limited number of nozzles can be replaced. If variable tag data is to be
used to specify dead nozzles, then large numbers of dead nozzles can be readily compensated for.

The dot reorg unit can be used to average out the nozzle usage when two rows of nozzles share the same
ink and tag encoding is not being used. The TE can be programmed to produce a regular pattern (e.g. 0101
on one line, and 1010 on the next) and this pattern can be used as a directive to direct dots into the
specified nozzle row.

Each reorg unit contains a 64-bit IOMapping value programmable as two 32-bit HCU registers, and a set
of selection logic based on the 6-bit dot input (2^6 = 64 bits), as shown in Figure 203.
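The lookup can be sketched in Python as below. Two points are assumptions made for illustration and are not stated in this section: that bit i of IOMapping holds the output for 6-bit input value i, and the particular assignment of input bits used in the example (bit 3 for contone plane 3, bit 4 for the spot color). The real assignment is given by Table 145.

```python
def reorg_dot(io_mapping, inputs):
    """Select one output bit from a 64-bit mapping using a 6-bit input value."""
    assert 0 <= inputs < 64
    # Assumed layout: bit i of io_mapping is the output for input value i.
    return (io_mapping >> inputs) & 1


def build_mapping(predicate):
    """Build a 64-bit mapping by evaluating predicate on every 6-bit input."""
    mapping = 0
    for inputs in range(64):
        if predicate(inputs):
            mapping |= 1 << inputs
    return mapping


# Hypothetical black-plane mapping: output 1 if contone plane 3 (bit 3)
# or the spot color (bit 4) is set.
BLACK_MAPPING = build_mapping(lambda x: bool(x & 0b011000))
```

Because every one of the 64 input combinations has its own entry, any logical function of the six inputs can be programmed this way.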

Figure 203. Block diagram of dot reorg unit

The mapping of input bits to each of the 6 selection bits is as defined in Table 145.

Table 145. Mapping of input bits to 6 selection bits
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29 Dead Nozzle Compensator (DNC) 

29.1 Overview 

The Dead Nozzle Compensator (DNC) is responsible for adjusting Memjet dot data to take account of
non-functioning nozzles in the Memjet printhead. Input dot data is supplied from the HCU, and the
corrected dot data is passed out to the DWU. The high level data path is shown by the block diagram in Figure 204.
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Figure 204. High level block diagram of DNC 

The DNC compensates for dead nozzles by performing the following operations:

• Dead nozzle removal, i.e. turn the nozzle off

• Ink replacement by direct substitution, i.e. K -> K_alternative

• Ink replacement by indirect substitution, i.e. K -> CMY

• Error division to adjacent nozzles 

• Fixative corrections 

The DNC is required to efficiently support up to 5% dead nozzles, under the expected DRAM bandwidth
allocation, with no restriction on where dead nozzles are located, and handle any fixative correction due to
nozzle compensations. Performance must degrade gracefully after 5% dead nozzles.



29.2 Dead nozzle identification 



Dead nozzles are identified by means of a position value and a mask value. Position information is
represented by a 10-bit delta encoded format, where the 10-bit value defines the number of dots between dead
nozzle columns. With the delta information the DNC also reads the 6-bit dead nozzle mask (dn_mask) for the
defined dead nozzle position. Each bit in the dn_mask corresponds to an ink plane. A set bit indicates that
the nozzle for the corresponding ink plane is dead. The dead nozzle table format is shown in Figure 205.

The DNC reads dead nozzle information from DRAM in single 256-bit accesses. A 10-bit delta encoding
scheme is chosen so that each table entry is 16 bits wide, and 16 entries fit exactly in each 256-bit read.
Using 10-bit delta encoding means that the maximum distance between dead nozzle columns is 1023 dots.
It is possible that dead nozzles may be spaced further than 1023 dots from each other, so a null dead nozzle
identifier is required. A null dead nozzle identifier is defined as a 6-bit dn_mask of all zeros. These null
dead nozzle identifiers should also be used so that:

• the dead nozzle table is a multiple of 16 entries (so that it is aligned to the 256-bit DRAM locations)

• the dead nozzle table spans the complete length of the line, i.e. the first entry in the dead nozzle table should
have a delta from the first nozzle column in a line and the last entry in the dead nozzle table should
correspond to the last nozzle column in a line.

1. For a 10-bit delta value of d, if the current column n is a dead nozzle column then the next dead nozzle column is given by n + (d + 1).

Note that the DNC deals with the width of a page. This may or may not be the same as the width of the
printhead (the PHI may introduce some margining to the page so that its dot output matches the width of
the printhead). Care must be taken when programming the dead nozzle table so that dead nozzle positions
are correctly specified with respect to the page and printhead.
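The 16-bit table entry layout (10-bit delta in bits 15-6, 6-bit dn_mask in bits 5-0) can be illustrated with a few Python helpers. The function names are invented for this sketch, and it covers only the packing of a single entry, not the construction of a whole table.

```python
ENTRY_BITS = 16
ENTRIES_PER_READ = 256 // ENTRY_BITS  # entries delivered by one 256-bit DRAM access


def pack_entry(delta, dn_mask):
    """Pack a 10-bit delta and a 6-bit dead nozzle mask into one 16-bit entry."""
    assert 0 <= delta < 1024 and 0 <= dn_mask < 64
    return (delta << 6) | dn_mask


def unpack_entry(entry):
    """Return (delta, dn_mask) from a 16-bit table entry."""
    return (entry >> 6) & 0x3FF, entry & 0x3F


def is_null_entry(entry):
    """A null dead nozzle identifier has a dn_mask of all zeros."""
    return (entry & 0x3F) == 0
```

For instance, a delta of 1023 with dn_mask b000101 packs to 0xFFC5, and the same delta with an all-zero mask is a null entry that can bridge a gap longer than 1023 dots.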



[Figure 205 labels: table 16 bits wide, N dead nozzle columns; table entry structure: 10-bit Delta Encode in
bits 15-6, 6-bit DnMask in bits 5-0]

Figure 205. Dead nozzle table format



29.3 DRAM storage and bandwidth requirement

The memory required is largely a factor of the number of dead nozzles present in the printhead (which in
turn is a factor of the printhead size). The DNC is required to read a 16-bit entry from the dead nozzle table
for every dead nozzle. Table 146 shows the DRAM storage and average* bandwidth requirements for the
DNC for different percentages of dead nozzles and different page sizes.

Table 146. Dead Nozzle storage and average bandwidth requirements
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2.4 



a. Bi-lithic printhead has 13824 nozzles per color providing full bleed printing for A4/Letter

b. Bi-lithic printhead has 19488 nozzles per color providing full bleed printing for A3

* Average bandwidth assumes an even spread of dead nozzles. Clumps of dead nozzles may cause delays due to insufficient available
DRAM bandwidth. These delays will occur every line causing an accumulative delay over a page.

c. 16 bits x 13824 nozzles x 0.05 dead

d. (16 bits read / 20 cycles) = 0.8 bits/cycle



29.4 Nozzle compensation 



The DNC receives 6 bits of dot information every cycle from the HCU, 1 bit per color plane. When the dot
position corresponds to a dead nozzle column, the associated 6-bit dn_mask indicates which ink plane(s)
contains a dead nozzle(s). The DNC first deletes dots destined for the dead nozzle. It then replaces those
dead dots, either by placing the data destined for the dead nozzle into an adjacent ink plane (direct
substitution) or into a number of ink planes (indirect substitution). After ink replacement, if a dead nozzle is
made active again then the DNC performs error diffusion. Finally, following the dead nozzle compensation
mechanisms the fixative, if present, may need to be adjusted due to new nozzles being activated, or
dead nozzles being removed.



29.4.1 Dead nozzle removal 



If a nozzle is defined as dead, then the first action for the DNC is to turn off (zero) the dot data destined
for that nozzle. This is done by a bit-wise ANDing of the inverse of the dn_mask with the dot value.

29.4.2 Ink replacement 

Ink replacement is a mechanism where data destined for the dead nozzle is placed into an adjacent ink
plane of the same color (direct substitution, i.e. K -> K_alternative), or placed into a number of ink planes, the
combination of which produces the desired color (indirect substitution, i.e. K -> CMY). Ink replacement is
performed by filtering out ink belonging to nozzles that are dead and then adding back in an appropriately
calculated pattern. This two step process allows the optional re-inclusion of the ink data into the original
dead nozzle position to be subsequently error diffused. In the general case, fixative data destined for a dead
nozzle should not be left active intending it to be later diffused.

The ink replacement mechanism has 6 ink replacement patterns, one per ink plane, programmable by the
CPU. The dead nozzle mask is ANDed with the dot data to see if there are any planes where the dot is
active but the corresponding nozzle is dead. The resultant value forms an enable, on a per ink basis, for the
ink replacement process. If replacement is enabled for a particular ink, the values from the corresponding
replacement pattern register are ORed into the dot data. The output of the ink replacement process is then
filtered so that error diffusion is only allowed for the planes in which error diffusion is enabled. The output
of the ink replacement logic is ORed with the resultant dot after dead nozzle removal. See Figure 210 on
page 459 for implementation details.

For example, consider the printhead color configuration C, M, Y, K1, K2, IR where the input dot data from
the HCU is b101100. Assume that the K1 ink plane and IR ink plane for this position are dead, so the
dead nozzle mask is b000101. The DNC first removes the dead nozzle by zeroing the K1 plane to produce
b101000. Then the dead nozzle mask is ANDed with the dot data to give b000100, which selects the ink
replacement pattern for K1 (in this case the ink replacement pattern for K1 is configured as b000010, i.e.
ink replacement into the K2 plane). Providing error diffusion for K2 is enabled, the output from the ink
replacement process is b000010. This is ORed with the output of dead nozzle removal to produce the
resultant dot b101010. As can be seen the dot data in the defective K1 nozzle was removed and replaced by
a dot in the adjacent nozzle in the same dot position, i.e. direct substitution.

In the example above the K1 ink plane could be compensated for by indirect substitution, in which case the ink
replacement pattern for K1 would be configured as b111000 (substitution into the CMY color planes), and
this is ORed with the output of dead nozzle removal to produce the resultant dot b111000. Here the dot
data in the defective K1 ink plane was removed and placed into the CMY ink planes.
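The two worked examples can be reproduced with a short Python sketch. It follows the bit order used in the examples (C is the leftmost bit, IR the rightmost) and indexes the replacement patterns by plane position in that order; both choices are made for illustration. The filtering of the replacement output by the error diffusion enables is deliberately left out, so this models only dead nozzle removal and the ORing in of replacement patterns.

```python
PLANES = ["C", "M", "Y", "K1", "K2", "IR"]  # leftmost bit first, as in the examples


def remove_dead(dot, dn_mask):
    """Zero every dot bit whose nozzle is marked dead."""
    return dot & ~dn_mask & 0x3F


def ink_replace(dot, dn_mask, patterns):
    """Apply dead nozzle removal, then OR in the selected replacement patterns.

    patterns[i] is the 6-bit replacement pattern for PLANES[i].
    """
    enable = dot & dn_mask  # planes where the dot is active but the nozzle is dead
    replacement = 0
    for i in range(len(PLANES)):
        plane_bit = 1 << (5 - i)
        if enable & plane_bit:
            replacement |= patterns[i]
    return remove_dead(dot, dn_mask) | replacement


direct = [0, 0, 0, 0b000010, 0, 0]    # K1 -> K2
indirect = [0, 0, 0, 0b111000, 0, 0]  # K1 -> CMY
```

Calling ink_replace(0b101100, 0b000101, direct) gives b101010 and the same call with the indirect patterns gives b111000, matching the text.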
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29.4.3 Error diffusion 

Based on the programming of the lookup table the dead nozzle may be left active after ink replacement. In
such cases the DNC can compensate using error diffusion. Error diffusion is a mechanism where dead
nozzle dot data is diffused to adjacent dots.

When a dot is active and its destined nozzle is dead, the DNC will attempt to place the data into an adja- 
cent dot position, if one is inactive. If both dots are inactive then the choice is arbitrary, and is determined 
by a pseudo random bit generator. If both neighbor dots are already active then the bit cannot be compen- 
sated by diffusion. 

Since the DNC needs to look at neighboring dots to determine where to place the new bit (if required), the
DNC works on a set of 3 dots at a time. For any given set of 3 dots, the first dot received from the HCU is
referred to as dot A, the second as dot B, and the third as dot C. The relationship is shown in Figure 206.
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Figure 206. Set of dots operated on for error diffusion 



For any given set of dots ABC, only B can be compensated for by error diffusion if B is defined as dead. A 
1 in dot B will be diffused into either dot A or dot C if possible. If there is already a 1 in dot A or dot C 
then a 1 in dot B cannot be diffused into that dot. 

The DNC must support adjacent dead nozzles. Thus if dot A is defined as dead and has previously been
compensated for by error diffusion, then the dot data from dot B should not be diffused into dot A.
Similarly, if dot C is defined as dead, then dot data from dot B should not be diffused into dot C.

Error diffusion should not cross line boundaries. If dot B contains a dead nozzle and is the first dot in a line
then dot A represents the last dot from the previous line. In this case an active bit on a dead nozzle of dot B
should not be diffused into dot A. Similarly, if dot B contains a dead nozzle and is the last dot in a line then
dot C represents the first dot of the next line. In this case an active bit on a dead nozzle of dot B should not
be diffused into dot C.

Thus, as a rule, a 1 in dot B cannot be diffused into dot A if

• a 1 is already present in dot A, 

• dot A is defined as dead, 

• or dot A is the last dot in a line. 

Similarly, a 1 in dot B cannot be diffused into dot C if

• a 1 is already present in dot C, 

• dot C is defined as dead, 

• or dot C is the first dot in a line. 

If B is defined to be dead and the dot value for B is 0, then no compensation needs to be done and dots A 
and C do not need to be changed. 

If B is defined to be dead and the dot value for B is 1, then B is changed to 0 and the DNC attempts to
place the 1 from B into either A or C:

• If the dot can be placed into both A and C, then the DNC must choose between them. The preference is
given by the current output from the random bit generator, 0 for "prefer left" (dot A) or 1 for "prefer
right" (dot C).
• If the dot can be placed into only one of A and C, then the 1 from B is placed into that position.

• If the dot cannot be placed into either one of A or C, then the DNC cannot place the dot in either position.

Table 147 shows the truth table for DNC error diffusion operation when dot B is defined as dead.
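The diffusion rules above can be summarised, for a single ink plane, by the following Python sketch. The function signature is invented for illustration: a_dead, b_dead and c_dead say whether each dot's nozzle is defined as dead, b_first and b_last say whether B is the first or last dot in a line, and rand_bit is the current random bit.

```python
def diffuse(a, b, c, a_dead, b_dead, c_dead, b_first, b_last, rand_bit):
    """Return the dot values (a, b, c) after error diffusion of dot B."""
    if not (b_dead and b == 1):
        return a, b, c  # nothing to compensate
    # A is unusable if already set, dead, or on the previous line.
    can_left = a == 0 and not a_dead and not b_first
    # C is unusable if already set, dead, or on the next line.
    can_right = c == 0 and not c_dead and not b_last
    if can_left and can_right:
        if rand_bit == 0:
            a = 1  # prefer left
        else:
            c = 1  # prefer right
    elif can_left:
        a = 1
    elif can_right:
        c = 1
    return a, 0, c  # B is always cleared when it is dead
```

When neither neighbor can accept the dot, the function still clears B and the dot is lost, which corresponds to the case where the DNC cannot place the dot in either position.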



Table 147. Error Diffusion Truth Table when dot B is dead

a. Output from random bit generator. Determines direction of error diffusion (0 = left, 1 = right)
b. Bold emphasis is used to show the DNC inserted a 1

The random bit value used to arbitrarily select the direction of diffusion is generated by a 32-bit maximum
length random bit generator. The generator generates a new bit for each dot in a line regardless of whether
the dot is dead or not. The random bit generator can be initialized with a 32-bit programmable seed value.
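A 32-bit linear feedback shift register with XNOR feedback can be sketched in Python as follows. The tap positions (32, 22, 2, 1) are an assumption taken from commonly published maximal-length tap tables, not from this document, and the choice of which bit serves as the random output is also illustrative.

```python
MASK32 = 0xFFFFFFFF
TAPS = (32, 22, 2, 1)  # assumed tap positions, not specified in this document


def lfsr_step(state):
    """Advance the LFSR by one dot; return (new_state, random_bit)."""
    parity = 0
    for tap in TAPS:
        parity ^= (state >> (tap - 1)) & 1
    feedback = parity ^ 1  # XNOR of the tapped bits
    new_state = ((state << 1) | feedback) & MASK32
    return new_state, feedback
```

With XNOR feedback over an even number of taps, the all-ones state maps to itself, so a generator seeded with all 1s would never change. The all-zeros seed, by contrast, is usable.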

29.4.4 Fixative correction

After the dead nozzle compensation methods have been applied to the dot data, the fixative, if present, may
need to be adjusted due to new nozzles being activated, or dead nozzles being removed. For each output
dot the DNC determines if fixative is required (using the FixativeRequiredMask register) for the new
compensated dot data word and whether fixative is activated already for that dot. For the DNC to do so it needs
to know the color plane that has fixative; this is specified by the FixativeMask1 configuration register.
Table 148 indicates the actions to take based on these calculations.



Table 148. Truth table for fixative correction

Fixative present   Fixative required   Action
1                  1                   Output dot as is.
1                  0                   Clear fixative plane.
0                  1                   Attempt to add fixative.
0                  0                   Output dot as is.



The DNC also allows the specification of another fixative plane, specified by the FixativeMask2
configuration register, with FixativeMask1 having the higher priority over FixativeMask2. When attempting to add
fixative the DNC first tries to add it into the planes defined by FixativeMask1. However, if any of these
planes is dead then it tries to add fixative by placing it into the planes defined by FixativeMask2.

Note that the fixative defined by FixativeMask1 and FixativeMask2 could possibly be multi-part fixative,
i.e. 2 bits could be set in FixativeMask1 with the fixative being a combination of both inks.
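One possible reading of the fixative correction rules is sketched below in Python. Several details are assumptions made for the sketch rather than statements from this section: fixative is treated as present if any plane in either mask is set, clearing removes the planes of both masks, and no fixative is added if both masks contain a dead plane.

```python
def fixative_correct(dot, dn_mask, fix_mask1, fix_mask2, required_mask):
    """Return the 6-bit dot after fixative correction (illustrative model only)."""
    fix_planes = fix_mask1 | fix_mask2
    present = (dot & fix_planes) != 0      # assumed: either mask counts as present
    required = (dot & required_mask) != 0
    if present and not required:
        return dot & ~fix_planes & 0x3F    # clear fixative plane(s)
    if required and not present:
        # Try the higher priority planes first, then fall back to FixativeMask2.
        if fix_mask1 and not (fix_mask1 & dn_mask):
            return dot | fix_mask1
        if fix_mask2 and not (fix_mask2 & dn_mask):
            return dot | fix_mask2
    return dot                             # output dot as is
```

For example, with FixativeMask1 = b000001, FixativeMask2 = b000010 and FixativeRequiredMask = b111000, a dot of b100000 becomes b100001, or b100010 if the FixativeMask1 nozzle is dead.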
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29.5 Implementation 

A block diagram of the DNC is shown in Figure 207. 
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Figure 207. Block diagram of DNC 
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29.5.1 Definitions of I/O 



Table 149. DNC port list and description

Port name            Pins  I/O  Description

Clocks and Resets
pclk                 1     In   System Clock.
prst_n               1     In   System reset, synchronous active low.

PCU Interface
pcu_dnc_sel          1     In   Block select from the PCU. When pcu_dnc_sel is high both
                                pcu_adr and pcu_dataout are valid.
pcu_rwn              1     In   Common read/not-write signal from the PCU.
pcu_adr[6:2]         5     In   PCU address bus. Only 5 bits are required to decode the
                                address space for this block.
pcu_dataout[31:0]    32    In   Shared write data bus from the PCU.
dnc_pcu_rdy          1     Out  Ready signal to the PCU. When dnc_pcu_rdy is high it
                                indicates the last cycle of the access. For a write cycle this
                                means pcu_dataout has been registered by the block and for
                                a read cycle this means the data on dnc_pcu_data is valid.
dnc_pcu_data[31:0]   32    Out  Read data bus to the PCU.

DIU Interface
dnc_diu_rreq         1     Out  DNC unit requests DRAM read. A read request must be
                                accompanied by a valid read address.
dnc_diu_radr[21:5]   17    Out  Read address to DIU. 256-bit word aligned.
diu_dnc_rack         1     In   Acknowledge from DIU that read request has been accepted
                                and new read address can be placed on dnc_diu_radr.
diu_dnc_rvalid       1     In   Read data valid, active high. Indicates that valid read data is
                                now on the read data bus, diu_data.
diu_data[63:0]       64    In   Read data from DIU.

HCU Interface
dnc_hcu_ready        1     Out  Indicates that DNC is ready to accept data from the HCU.
hcu_dnc_avail        1     In   Indicates valid data present on hcu_dnc_data.
hcu_dnc_data[5:0]    6     In   Output bi-level dot data in 6 ink planes.

DWU Interface
dwu_dnc_ready        1     In   Indicates that DWU is ready to accept data from the DNC.
dnc_dwu_avail        1     Out  Indicates valid data present on dnc_dwu_data.
dnc_dwu_data[5:0]    6     Out  Output bi-level dot data in 6 ink planes.



29.5.2 Configuration registers 

The configuration registers in the DNC are programmed via the PCU interface. Refer to section 21.8.2 on
page 257 for the description of the protocol and timing diagrams for reading and writing registers in the
DNC. Note that since addresses in SoPEC are byte aligned and the PCU only supports 32-bit register reads
and writes, the lower 2 bits of the PCU address bus are not required to decode the address space for the
DNC. When reading a register that is less than 32 bits wide zeros should be returned on the upper unused
bit(s) of dnc_pcu_data. Table 150 lists the configuration registers in the DNC.
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Table 1 50. ONC configuration registers 











Control registers 


t I- nil ru 


0x00 


Reset 


1 


0x1 


A write to this register causes a reset of the 
DNC. 


0x04 


Go 


1 


0x0 


Writing 1 to this register starts the ONC. Writing 
0 to this register halts the DNC. 
When Go is asserted all counters, flags etc. are 
cleared or given their initial value, but configura- 
tion registers keep their values. 
When Go is deasserted the state-machines go 
to their idle states but all counters and configu- 
ration registers keep their values. 
This register can be read to determine If the 
ONC is running 
(1 = running, 0 a stopped). 


Setup registers ( 


constant during processing) 


0x10 


MaxDot 


16 


0x0000 


This is the maximum dot number - 1 present 
across a page. For example if a page contains 
13824 dots, then MaxOof %vill be 13823. 
Note that this number may or may not l)e the 
same as the number of dots across the print- 
head as some margining may be introduced in 
the PHI. 


0x14 


LSFR 


32 


OxOOOO_ 
0000 


The current value of the Li=SR register used as 
the 32-bit maximum length random bit genera- 
tor. 

Users can write to this register to program a 
seed value for the 32-bit maximum length ran- 
dom bit generator. Must CKit be all Is for taps 
implememed in XNOR form, (it is expected that 
writing a seed value wiO not occur during the 
operation of the LFSR). 

This LSFR value couki also have a possible use 
as a random source in program code. 


0x20 


RxativeMaskl 


6 


0x00 


Defines the higher prtority fixative p{ane<8). Bit 0 

represents the settings for plane 0, bit 1 for 

plane 1 etc. For each bit: 

1 = the ink plane contains fixative. 

0 = the Ink plane does not contain fixative. 


0x24 


RxativeMask2 


6 


0x00 


Oefines the tower priority fixative plane(s). Bit 0 

represents the settings for plane 0. bH 1 for 

plane 1 etc. Used only when Fixati\/eMask1 

planes are dead. For each bit 

1 s£ the ink plane contains fixative. 

0 s the ink plane does not contain fixative. 


0x28 


RxativeRequtredMask 


6 


0x00 


identifies the ink planes that require fixative. Bit 

0 represents the settings for plane 0. bit 1 for 
plane 1 ete. For each bit: 

1 = the ink plane requires fixative. 

0 = the ink plane does not require fixative (e.g. 
ink is self-fixing) 
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Tabfe 150. DNC configuration registers 




DnTableStartAdr 



0x0.0000 




Start address of Dead Nozzle Tabfe in DRAM, 
specified in 2S6>bIt words. 



0x34 



DnTableEndAdr 



17 



0x0_0000 



End address of Dead Nozzle Table in DRAM, 
specifted in 256-bIt words, i.e. the location con- 
talning the last entry in the Dead Nozzle TaUe. 
The Dead Nozzle Table should be aligned to a 
256-bit boundary, if necessary it can be padded 
with null entries. 



0x40 - 0x54 



PlaneReplacePattern[5:0]


6x6


0x00


Defines the ink replacement pattern for each of
the 6 ink planes. PlaneReplacePattern[0] is the
ink replacement pattern for plane 0,
PlaneReplacePattern[1] is the ink replacement pattern
for plane 1, etc.

For each 6-bit replacement pattern for a plane,
a 1 in any bit position indicates the alternative
ink planes to be used for this plane.



0x58 



DiffuseEnable


0x3F


Defines whether, after ink replacement, error
diffusion is allowed to be performed on each
plane.

Bit 0 represents the settings for plane 0, bit 1 for
plane 1 etc. For each bit:
1 = error diffusion is enabled
0 = error diffusion is disabled



Debug registers (read only) 



0x60 



DncOutputDebug 



N/A 



Bit 7 = dwu_dnc_ready
Bit 6 = dnc_dwu_avail
Bits 5-0 = dnc_dwu_data



0x64 



DncReplaceDebug


14


N/A


Bit 13 = edu_ready
Bit 12 = iru_avail
Bits 11-6 = iru_dn_mask
Bits 5-0 = iru_data



0x68 



DncDiffuseDebug


14


N/A


Bit 13 = dwu_dnc_ready
Bit 12 = dnc_dwu_avail
Bits 11-6 = edu_dn_mask
Bits 5-0 = edu_data
29.5.3 Ink replacement unit

Figure 208 shows a sub-block diagram for the ink replacement unit.

Figure 208. Sub-block diagram of ink replacement unit



29.5.3.1 Control unit



The control unit is responsible for reading the dead nozzle table from DRAM and making it available to
the DNC via the dead nozzle FIFO. The dead nozzle table is read from DRAM in single 256-bit accesses,
receiving the data from the DIU over 4 clock cycles (64 bits per cycle). The protocol and timing for read
accesses to DRAM is described in section 20.9.1 on page 208. Reading from DRAM is implemented by
means of the state machine shown in Figure 209.

All counters and flags should be cleared after reset. When Go transitions from 0 to 1 all counters and flags
should take their initial value. While the Go bit is 1, the state machine requests a read access from the dead
nozzle table in DRAM provided there is enough space in its FIFO.

A modulo-4 counter, rd_count, is used to count each of the 64-bit values received in a 256-bit read access. It is
incremented whenever diu_dnc_rvalid is asserted. When Go is 1, dn_table_radr is set to
dn_table_start_adr. As each 64-bit value is returned, indicated by diu_dnc_rvalid being asserted,
dn_table_radr is compared to dn_table_end_adr.

• If rd_count equals 3 and dn_table_radr equals dn_table_end_adr, then dn_table_radr is updated to
dn_table_start_adr.

• If rd_count equals 3 and dn_table_radr does not equal dn_table_end_adr, then dn_table_radr is
incremented by 1.

A count is kept of the number of 64-bit values in the FIFO. When diu_dnc_rvalid is 1, data is written to the
FIFO by asserting wr_en, and fifo_contents and fifo_wr_adr are both incremented.

When fifo_contents[3:0] is greater than 0 and edu_ready is 1, dnc_hcu_ready is asserted to indicate that
the DNC is ready to accept dots from the HCU. If hcu_dnc_avail is also 1 then a dotadv pulse is sent to the
GenMask unit, indicating the DNC has accepted a dot from the HCU, and iru_avail is also asserted. After
Go is set, a single preload pulse is sent to the GenMask unit once the FIFO contains data.

When a rd_adv pulse is received from the GenMask unit, fifo_rd_adr[4:0] is then incremented to select
the next 16-bit value. If fifo_rd_adr[1:0] = 11 then the next 64-bit value is read from the FIFO by asserting
rd_en, and fifo_contents[3:0] is decremented.



Figure 209. Dead nozzle table state machine



29.5.3.2 Dead nozzle FIFO 

The dead nozzle FIFO conceptually is a 64-bit input, 16-bit output FIFO, to account for the 64-bit data
transfers from the DIU and the individual 16-bit entries in the dead nozzle table that are used in the
GenMask unit. In reality, the FIFO is 8 entries deep and 64 bits wide (to accommodate two 256-bit
accesses).

On the DRAM side of the FIFO the write address is 64-bit aligned while on the GenMask side the read
address is 16-bit aligned, i.e. the upper 3 bits are input as the read address for the FIFO and the lower 2 bits
are used to select 16 bits from the 64 bits (1st 16 bits read corresponds to bits 15-0, second 16 bits to bits
31-16 etc.).

29.5.3.3 GenMask unit

The GenMask unit generates the 6-bit dn_mask that is sent to the replace unit. It consists of a 10-bit delta 
counter and a mask register. 

After Go is set, the GenMask unit will receive a preload pulse from the control unit indicating the first 
dead nozzle table entry is available at the output of the dead nozzle FIFO and should be loaded into the 
delta counter and mask register. A rd_adv pulse is generated so that the next dead nozzle table entry is pre- 
sented at the output of the dead nozzle FIFO. The delta counter is decremented every time a dotadv pulse 
is received. When the delta counter reaches 0, it gets loaded with the current delta value output from the 
dead nozzle FIFO, i.e. bits 15-6, and the mask register gets loaded with mask output from the dead nozzle 
FIFO, i.e. bits 5-0. A rd_adv pulse is then generated so that the next dead nozzle table entry is presented at 
the output of the dead nozzle FIFO. 

When the delta counter is 0 the value in the mask register is output as the dn_mask, otherwise the dn_mask
is all 0s.

The GenMask unit has no knowledge of the number of dots in a line; it simply loads a counter to count the
delta from one dead nozzle column to the next. Thus, as described in section 29.2 on page 446, the dead
nozzle table should include null identifiers if necessary so that the dead nozzle table covers the first and
last nozzle column in a line.
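
The entry layout described above can be illustrated with a short behavioural sketch in Python. It is not part of the hardware description; the function names are hypothetical, and it only shows how a 64-bit FIFO word breaks into four 16-bit dead nozzle table entries (lowest bits first) and how each entry splits into a 10-bit delta (bits 15-6) and a 6-bit mask (bits 5-0).

```python
def unpack_fifo_word(word64):
    """Split a 64-bit FIFO word into four 16-bit entries.

    The first entry read is bits 15-0, the second bits 31-16, and so on.
    """
    return [(word64 >> (16 * i)) & 0xFFFF for i in range(4)]


def unpack_entry(entry16):
    """Return (delta, mask) for a 16-bit dead nozzle table entry.

    Bits 15-6 hold the delta to the next dead nozzle column,
    bits 5-0 hold the per-plane dead nozzle mask.
    """
    delta = (entry16 >> 6) & 0x3FF
    mask = entry16 & 0x3F
    return delta, mask
```

For example, an entry with delta 5 and mask 101001 (binary) would be stored as (5 << 6) | 0b101001.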

29.5.3.4 Replace unit 

Dead nozzle removal and ink replacement are implemented by the combinatorial logic shown in Figure
210. Dead nozzle removal is performed by bit-wise ANDing of the inverse of the dn_mask with the dot
value.

The ink replacement mechanism has 6 ink replacement patterns, one per ink plane, programmable by the 
CPU. The dead nozzle mask is ANDed with the dot data to see if there are any planes where the dot is 
active but the corresponding nozzle is dead. The resultant value forms an enable, on a per ink basis, for the 
ink replacement process. If replacement is enabled for a particular ink, the values from the corresponding 
replacement pattern register are ORed into the dot data. The output of the ink replacement process is then 
filtered so that error diffusion is only allowed for the planes in which error diffusion is enabled. 

The output of the ink replacement process is ORed with the resultant dot after dead nozzle removal. If the
dot position does not contain a dead nozzle then the dn_mask will be all 0s and the dot, hcu_dnc_data, will
be passed through unchanged.
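
The removal and replacement steps can be sketched in Python as follows. This is a simplified behavioural model under stated assumptions, not the logic of Figure 210: the function name and argument order are hypothetical, and the filtering by the error diffusion enable bits is left out because the text does not say exactly where it is applied.

```python
PLANES = 6
MASK6 = 0x3F  # six ink planes, one bit each


def replace_unit(dot, dn_mask, patterns):
    """Model dead nozzle removal followed by ink replacement.

    dot      -- 6-bit dot value from the HCU
    dn_mask  -- 6-bit dead nozzle mask for this dot position
    patterns -- list of six 6-bit replacement patterns, one per plane
    """
    # Dead nozzle removal: AND the dot with the inverse of the mask.
    removed = dot & ~dn_mask & MASK6
    # A plane needs replacement when the dot is active and the nozzle is dead.
    enable = dot & dn_mask & MASK6
    replacement = 0
    for plane in range(PLANES):
        if enable & (1 << plane):
            replacement |= patterns[plane] & MASK6
    # Replacement ink is ORed with the dot after dead nozzle removal.
    return removed | replacement
```

With an all-zero dn_mask the function returns the dot unchanged, matching the pass-through case described above.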
Figure 210. Logic for dead nozzle removal and ink replacement

29.5.4 Error Diffusion Unit

Figure 211 shows a sub-block diagram for the error diffusion unit.

Figure 211. Sub-block diagram of error diffusion unit



29.5.4.1 Random Bit Generator



The random bit value used to arbitrarily select the direction of diffusion is generated by a maximum length
32-bit LFSR. The tap points and feedback generation are shown in Figure 212. The LFSR generates a new
bit for each dot in a line regardless of whether the dot is dead or not, i.e. shifting of the LFSR is enabled
when advdot equals 1. The LFSR can be initialised with a 32-bit programmable seed value, random_seed.
This seed value is loaded into the LFSR whenever a write occurs to the RandomSeed register. Note that the
seed value must not be all 1s as this causes the LFSR to lock up.

[Figure: 32-bit shift register, bits 31 down to 0, with XNOR feedback and an output bit]

Figure 212. Maximum length 32-bit LFSR used for random bit generation
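
A minimal Python sketch of an XNOR-feedback 32-bit LFSR is given here to illustrate the lock-up behaviour noted above. The tap positions and shift direction are assumptions (taps 32, 22, 2 and 1 from commonly published tables); the actual tap points are those of Figure 212, which are not reproduced in this text.

```python
# Assumed tap bit indices (0-based); not taken from Figure 212.
TAPS = (31, 21, 1, 0)
MASK32 = 0xFFFFFFFF


def lfsr_step(state):
    """Advance the LFSR by one dot and return the new 32-bit state."""
    parity = 0
    for tap in TAPS:
        parity ^= (state >> tap) & 1
    feedback = parity ^ 1  # XNOR of the tapped bits
    return ((state << 1) | feedback) & MASK32


def random_bits(seed, count):
    """Return `count` output bits, taking bit 31 before each shift."""
    bits = []
    state = seed & MASK32
    for _ in range(count):
        bits.append((state >> 31) & 1)
        state = lfsr_step(state)
    return bits
```

With XNOR feedback over an even number of taps, the all-1s state feeds back a 1 and never changes, which is why an all-1s seed must be avoided. An all-0s seed is acceptable in this form.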



29.5.4.2 Advance Dot Unit

The advance dot unit is responsible for determining in a given cycle whether or not the error diffuse unit
will accept a dot from the ink replacement unit or make a dot available to the fixative correct unit and on to
the DWU. It therefore receives the dwu_dnc_ready control signal from the DWU, the iru_avail flag from
the ink replacement unit, and generates dnc_dwu_avail and edu_ready control flags.

Only the dwu_dnc_ready signal needs to be checked to see if a dot can be accepted; the unit asserts edu_ready
to indicate this. If the error diffuse unit is ready to accept a dot and the ink replacement unit has a dot
available, then an advdot pulse is given to shift the dot into the pipeline in the diffuse unit. Note that since the
error diffusion operates on 3 dots, the advance dot unit ignores dwu_dnc_ready initially until 3 dots have
been accepted by the diffuse unit. Similarly dnc_dwu_avail is not asserted until the diffuse unit contains 3
dots and the ink replacement unit has a dot available.
29.5.4.3 Diffuse Unit

The diffuse unit contains the combinatorial logic to implement the truth table from Table 147. The diffuse
unit receives a dot consisting of 6 color planes (1 bit per plane) as well as an associated 6-bit dead nozzle
mask value.

Error diffusion is applied to all 6 planes of the dot in parallel. Since error diffusion operates on 3 dots, the
diffuse unit has a pipeline of 3 dots and their corresponding dead nozzle mask values. The first dot
received is referred to as dot A, the second as dot B, and the third as dot C. Dots are shifted along the
pipeline whenever advdot is 1. A count is also kept of the number of dots received. It is incremented
whenever advdot is 1, and wraps to 0 when it reaches max_dot. When the dot count is 0 dot C corresponds to the
first dot in a line. When the dot count is 1 dot A corresponds to the last dot in a line.

In any given set of 3 dots only dot B can be defined as containing a dead nozzle(s). Dead nozzles are
identified by bits set in iru_dn_mask. If dot B contains a dead nozzle(s), the corresponding bit(s) in dot A, dot
C, the dead nozzle mask value for A, the dead nozzle mask value for C, the dot count, as well as the
random bit value are input to the truth table logic and the dots A, B and C assigned accordingly. If dot B does
not contain a dead nozzle then the dots are shifted along the pipeline unchanged.

29.5.5 Fixative Correction Unit 

The fixative correction unit consists of combinatorial logic to implement fixative correction as defined in
Table 151. For each output dot the DNC determines if fixative is required for the new compensated dot
data word and whether fixative is activated already for that dot.

FixativePresent = ((FixativeMask1 | FixativeMask2) & edu_data) != 0
FixativeRequired = (FixativeRequiredMask & edu_data) != 0

It then looks up the truth table to see what action, if any, needs to be taken.



Table 151. Truth table for fixative correction

FixativePresent  FixativeRequired  Action                    Output
1                1                 Output dot as is.         dnc_dwu_data = edu_data
1                0                 Clear fixative plane.     dnc_dwu_data = (edu_data) & ~(FixativeMask1 | FixativeMask2)
0                1                 Attempt to add fixative.  if (FixativeMask1 & DnMask) != 0
                                                                 dnc_dwu_data = (edu_data) | (FixativeMask2 & ~DnMask)
                                                             else
                                                                 dnc_dwu_data = (edu_data) | (FixativeMask1)
0                0                 Output dot as is.         dnc_dwu_data = edu_data



When attempting to add fixative the DNC first tries to add it into the plane defined by FixativeMask1.
However, if this plane is dead then it tries to add fixative by placing it into the plane defined by
FixativeMask2. Note that if FixativeMask1 and FixativeMask2 are both all 0s then the dot data will
not be changed.
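
The truth table of Table 151 can be expressed as a small Python function. This is an illustrative sketch only; the function name and argument order are hypothetical, and the mask values used in any example are arbitrary.

```python
MASK6 = 0x3F  # six ink planes


def fixative_correct(edu_data, dn_mask, fix_mask1, fix_mask2, fix_required_mask):
    """Apply fixative correction to one 6-bit dot, following Table 151."""
    present = ((fix_mask1 | fix_mask2) & edu_data) != 0
    required = (fix_required_mask & edu_data) != 0

    if present and not required:
        # Clear the fixative plane(s).
        return edu_data & ~(fix_mask1 | fix_mask2) & MASK6
    if required and not present:
        # Attempt to add fixative, falling back to the second plane
        # when the first fixative plane is dead.
        if (fix_mask1 & dn_mask) != 0:
            return edu_data | (fix_mask2 & ~dn_mask & MASK6)
        return edu_data | fix_mask1
    # Both set or both clear: output the dot as is.
    return edu_data
```

If both fixative masks are 0 the "attempt to add" branch ORs in nothing, so the dot data is unchanged, as noted above.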
30 Dotline Writer Unit (DWU)

30.1 Overview

The Dotline Writer Unit (DWU) receives 1 dot (6 bits) of color information per cycle from the DNC. Dot
data received is bundled into 256-bit words and transferred to the DRAM. The DWU (in conjunction with
the LLU) implements a dot line FIFO mechanism to compensate for the physical placement of nozzles in a
printhead, and provides data rate smoothing to allow for local complexities in the dot data generate
pipeline.
Figure 213. High level data flow diagram of DWU in context



30.2 Physical requirement imposed by the printhead 

The physical placement of nozzles in the printhead means that in one firing sequence of all nozzles, dots
will be produced over several print lines. The printhead consists of 12 rows of nozzles, one for each color
of odd and even dots. Odd and even nozzles are separated by D2 print lines and nozzles of different colors
are separated by D1 print lines. See Figure 214 for reference. The first color to be printed is the first row of
nozzles encountered by the incoming paper. In the example this is color 0 odd, although this is dependent on
the printhead type (see Section 35 Memjet Printhead for other printhead arrangements). Paper passes under
the printhead moving downwards.



[Figure: type 0 printhead IC nozzle layout. Note: paper passes under printhead.]

Figure 214. Printhead Nozzle Layout for conceptual 36 Nozzle bi-lithic printhead

For example, suppose the physical separation of each half row is 80µm, equating to D1=D2=5 print lines at
1600 dpi. This means that in one firing sequence, color 0 odd nozzles will fire on dotline L, color 0 even
nozzles will fire on dotline L-D1, color 1 odd nozzles will fire on dotline L-D1-D2 and so on over 6 color
planes odd and even nozzles. The total number of lines fired over is given as 0+5+5+...+5 = 0 + 11x5 = 55.

See Figure 215 for an example diagram.

Figure 215. Paper and printhead nozzles relationship (example with D1=D2=5)

It is expected that the physical spacing of the printhead nozzles will be 80µm (or 5 dot lines), although
there is no dependency on nozzle spacing. The DWU is configurable to allow other line nozzle spacings.



Table 152. Relationship between Nozzle color/sense and line firing

          sense   line    sense   line
Color 0   even    L       even    L-5
          odd     L-5     odd     L
Color 1   even    L-10    even    L-15
          odd     L-15    odd     L-10
Color 2   even    L-20    even    L-25
          odd     L-25    odd     L-20
Color 3   even    L-30    even    L-35
          odd     L-35    odd     L-30
Color 4   even    L-40    even    L-45
          odd     L-45    odd     L-40
Color 5   even    L-50    even    L-55
          odd     L-55    odd     L-50
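
The line offsets in Table 152 follow directly from the nozzle row spacing. The Python sketch below, with a hypothetical function name, lists the offset below line L for each of the 12 half-color rows in the order they are encountered, alternating the D1 and D2 gaps as in the worked example above.

```python
def firing_offsets(colors=6, d1=5, d2=5):
    """Return the dotline offset (lines behind L) for each nozzle row.

    Rows are listed in the order the paper encounters them. The gap
    after the first row of a color is d1 and the gap to the next color
    is d2, as in the example (L, L-D1, L-D1-D2, ...).
    """
    offsets = []
    line = 0
    for row in range(2 * colors):
        offsets.append(line)
        line += d1 if row % 2 == 0 else d2
    return offsets
```

With the default spacing of 5 lines the offsets run 0, 5, 10, ... up to 55, matching the last row of the table.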



30.3 Line rate de-coupling 

The DWU block is required to compensate for the physical spacing between lines of nozzles. It does this
by storing dot lines in a FIFO (in DRAM) until such time as they are required by the LLU for dot data
transfer to the printhead interface. Colors are stored separately because they are needed at different times
by the LLU. The dot line store must store enough lines to compensate for the physical line separation of
the printhead but can optionally store more lines to allow system level data rate variation between the read
(printhead feed) and write sides (dot data generation pipeline) of the FIFOs.

A logical representation of the FIFOs is shown in Figure 216, where N is defined as the optional number of 
extra half lines in the dot line store for data rate de-coupling. 
Figure 216. Dot line store logical representation



30.4 Dot line store storage requirements

For an arbitrary page width of d dots (where d is even), the number of dots per half line is d/2.

For interline spacing of D2 and inter-color spacing of D1, with C colors of odd and even half lines, the
number of half line storage is (C - 1) (D2 + D1) + D1.

For N extra half line stores for each color odd and even, the storage is given by (N * C * 2).
The total storage requirement is ((C - 1) (D2 + D1) + D1 + (N * C * 2)) * d/2 in bits.



Doc: SoPEC_hardware_design S3 Proprietary Document 29 Nov 2002 

Version: 2.3 — Page 465 




SoPEC ! Hardware Design 



Note that when determining the storage requirements for the dot line store, the number of dots per line is
the page width and not necessarily the printhead width. The page width is often the dot margin number of
dots less than the printhead width. They can be the same size for full bleed printing.

For example in an A4 page a line consists of 13824 dots at 1600 dpi, or 6912 dots per half dot line. To
store just enough dot lines to account for an inter-line nozzle spacing of 5 dot lines it would take 55 half
dot lines for color 5 odd, 50 dot lines for color 5 even and so on, giving 55+50+45+...+10+5+0 = 330 half dot
lines in total. If it is assumed that N=4 then the storage required to store 4 extra half lines per color is 4 x
12 = 48, in total giving 330+48 = 378 half dot lines. Each half dot line is 6912 dots, at 1 bit per dot, giving a
total storage requirement of 6912 dots x 378 half dot lines / 8 bits = approx. 319 Kbytes. Similarly for an
A3 size page with 19488 dots per line, 9744 dots per half line x 378 half dot lines / 8 = approx. 449
Kbytes.
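
The arithmetic of this example can be checked with a short Python sketch. The function names are hypothetical; the calculation simply sums the per-row line offsets, adds N extra half lines for each of the 12 half colors, and converts to bytes at 1 bit per dot.

```python
def half_lines_needed(colors=6, spacing=5, extra=4):
    """Half dot lines stored: one offset per nozzle row plus the extras."""
    rows = 2 * colors
    base = sum(spacing * row for row in range(rows))  # 0 + 5 + ... + 55
    return base + extra * rows


def store_bytes(dots_per_line, half_lines):
    """Storage in bytes for `half_lines` half dot lines at 1 bit per dot."""
    dots_per_half_line = dots_per_line // 2
    return dots_per_half_line * half_lines // 8


a4_bytes = store_bytes(13824, half_lines_needed())
a3_bytes = store_bytes(19488, half_lines_needed())
```

Dividing the byte counts by 1024 gives the Kbyte figures quoted in the text.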



Table 153. Storage requirement for dot line store 
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378 
The potential size of the dot line store makes it infeasible to implement in on-chip SRAM, requiring
the dot line store to be implemented in embedded DRAM. This allows a configurable dotline store where
unused storage can be redistributed for use by other parts of the system.



30.5 Local buffering

An embedded DRAM is expected to be of the order of 256 bits wide, which results in 27 words per half
line of an A4 page, and 54 words per half line of A3. This requires 27 words x 12 half colors (6 colors odd
and even) = 324 x 256-bit DRAM accesses over a dotline print time, equating to 6 bits per cycle (equal to
DNC generate rate of 6 bits per cycle). Each half color is required to be double buffered: while filling one
buffer the other buffer is being written to DRAM. This results in 256 bits x 2 buffers x 12 half colors, i.e.
6144 bits in total.

The buffer requirement can be reduced by using 1.5x buffering, where the DWU is filling 128 bits while the
remaining 256 bits are being written to DRAM. While this reduces the required buffering locally it
increases the peak bandwidth requirement to the DRAM. With 2x buffering the average and peak DRAM
bandwidth requirement is the same and is 6 bits per cycle; alternatively with 1.5x buffering the average
DRAM bandwidth requirement is 6 bits per cycle but the peak bandwidth requirement is 12 bits per cycle.
The amount of buffering used will depend on the DRAM bandwidth available to the DWU unit.
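
The A4 buffering figures can be reproduced with the following Python sketch. The function names are hypothetical, and the calculation assumes one dot per cycle from the DNC and a 256-bit DRAM word, as stated in the text.

```python
WORD_BITS = 256


def words_per_half_line(dots_per_line):
    """256-bit words needed for one half line, rounded up."""
    half = dots_per_line // 2
    return -(-half // WORD_BITS)  # ceiling division


def accesses_per_line(dots_per_line, half_colors=12):
    """DRAM write accesses needed per dotline print time."""
    return words_per_half_line(dots_per_line) * half_colors


def buffer_bits(buffers_per_half_color, half_colors=12):
    """Local buffer storage for 2x or 1.5x buffering."""
    return int(WORD_BITS * buffers_per_half_color * half_colors)


a4_dots = 13824
avg_bits_per_cycle = accesses_per_line(a4_dots) * WORD_BITS / a4_dots
```

For the A4 case this gives 27 words per half line, 324 accesses per line, and an average of 6 bits per cycle; 2x buffering needs 6144 bits and 1.5x buffering needs 4608 bits.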
Figure 217. Comparison of 1.5x v 2x buffering



Should the DWU fail to get the required DRAM access within the specified time, the DWU will stall the
DNC data generation. The DWU will issue the stall in sufficient time for the DNC to respond and still not
cause a FIFO overrun. Should the stall persist for a sufficiently long time, the PHI will be starved of data
and be unable to deliver data to the printhead in time. The sizing of the dotline store FIFO and internal
FIFOs should be chosen so as to prevent such a stall happening.



30.6 Dotline data in memory



The dot data shift register order in the printhead is shown in Figure 214 (the transmit order is the opposite 
of the shift register order). In the example the type 0 printhead IC transmit order is increasing even color 
data followed by decreasing odd color data. The type 1 printhead IC transmit order is decreasing odd color 
data followed by increasing even color data. For both printhead ICs the even data is always increasing 
order and odd data is always decreasing. The PHI controls which printhead IC data gets shifted to. 

From this it is beneficial to store even data in increasing order in DRAM and odd data in decreasing order. 
While this order suits the example printhead, other printheads exist where it would be beneficial to store 
even data in decreasing order, and odd data in increasing order, hence the order is configurable. The order 
that data is stored in memory is controlled by setting the ColorLineSense register.

The dot order in DRAM for increasing and decreasing sense is shown in Figure 218 and Figure 219
respectively. For each line in the dot store the order is the same (although for odd lines the numbering will
be different the order will remain the same). Dot data from the DNC is always received in increasing dot
number order. For increasing sense dot data is bundled into 256-bit words and written in increasing order
in DRAM, word 0 first, then word 1, and so on to word N, where N is the number of words in a line.

For decreasing sense dot data is also bundled into 256-bit words, but is written to DRAM in decreasing 
order, i.e. word N is written first then word N-1 and so on to word 0. For both increasing and decreasing 
sense the data is aligned to bit 0 of a word, i.e. increasing sense always starts at bit 0, decreasing sense 
always finishes at bit 0. 
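
The two storage orders can be illustrated with a short Python sketch that maps a received dot to its word and bit within a half line. The function name is hypothetical, dot indices are taken as 0-based, and word numbers are relative to the lowest-addressed word of the line.

```python
WORD_BITS = 256


def dram_position(dot_index, dots_in_half_line, increasing):
    """Return (word, bit) for the dot_index-th dot received from the DNC.

    Increasing sense starts at bit 0 of word 0. Decreasing sense is
    mirrored so that the last dot received lands on bit 0 of word 0.
    """
    if increasing:
        pos = dot_index
    else:
        pos = dots_in_half_line - 1 - dot_index
    return pos // WORD_BITS, pos % WORD_BITS
```

For a 13320 dot wide line (6660 dots per half line), the first dot of a decreasing sense line lands in word 26, so the highest word is written first and the line finishes at bit 0 of word 0.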

Figure 218. Even dot order in DRAM (Increasing Sense, 13320 dot wide line)



Figure 219. Even dot order in DRAM (Decreasing Sense, 13320 dot wide line)



Each half color is configured independently of any other color. The ColorBaseAdr register specifies the
position where data for a particular dotline FIFO will begin writing to. Note that for increasing sense
colors the ColorBaseAdr register specifies the address of the first word of the first line of the FIFO, whereas for
decreasing sense colors the ColorBaseAdr register specifies the address of the last word of the first line of the
FIFO.

Dot data received from the DNC is bundled in 256-bit words and transferred to the DRAM. Each line of 
data is stored consecutively in DRAM, with each line separated by ColorLineInc number of words. 

For each line stored in DRAM the DWU increments the line count and calculates the DRAM address for 
the next line to store. 
This process continues until ColorFifoSize number of lines are stored, after which the DRAM address will
wrap back to the ColorBaseAdr address.



Figure 220. Dotline FIFO data structure in DRAM
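
The wrap-around addressing of a dotline FIFO can be sketched in Python as below. The function name is hypothetical and addresses are in 256-bit words; the sketch assumes the line count simply wraps modulo ColorFifoSize, which is one way of reading the description above.

```python
def line_start_address(line_number, color_base_adr, color_line_inc, color_fifo_size):
    """Return the DRAM word address at which a given line is stored.

    Lines are spaced ColorLineInc words apart and wrap back to
    ColorBaseAdr after ColorFifoSize lines.
    """
    slot = line_number % color_fifo_size
    return color_base_adr + slot * color_line_inc
```

For a decreasing sense color the same calculation applies, with ColorBaseAdr pointing at the last word of the first line rather than the first.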

As each line is written to the FIFO, the DWU increments the FifoFillLevel register, and as the LLU reads a
line from the FIFO the FifoFillLevel register is decremented. The LLU indicates that it has completed
reading a line by a high pulse on the llu_dwu_line_rd line.

When the number of lines stored in the FIFO is equal to the MaxWriteAhead value the DWU will indicate
to the DNC that it is no longer able to receive data (i.e. a stall) by deasserting the dwu_dnc_ready signal.

The ColorEnable register determines which color planes should be processed. If a plane is turned off, data
is ignored for that plane and no DRAM accesses for that plane are generated.

30.7 Implementation

30.7.1 Definitions of I/O 



Table 154. DWU I/O Definition

Port name                     Pins   I/O   Description

Clocks and Resets
pclk                          1      In    System Clock
prst_n                        1      In    System reset, synchronous active low

DNC Interface
dwu_dnc_ready                 1      Out   Indicates that DWU is ready to accept data from the DNC.
dnc_dwu_avail                 1      In    Indicates valid data present on dnc_dwu_data.
dnc_dwu_data[5:0]             6      In    Input bi-level dot data in 6 ink planes.

LLU Interface
dwu_llu_line_wr               1      Out   DWU line write. Indicates that the DWU has completed a full
                                           line write. Active high.
llu_dwu_line_rd               1      In    LLU line read. Indicates that the LLU has completed a line
                                           read. Active high.

LLU and DWU common configuration
dwu_llu_fifosize[11:0][7:0]   12x8   Out   Indicates the number of lines in the FIFO before the line
                                           increment will wrap around in memory.
                                           Bus 0,1 - Even, Odd line color 0
                                           Bus 2,3 - Even, Odd line color 1
                                           Bus 4,5 - Even, Odd line color 2
                                           Bus 6,7 - Even, Odd line color 3
                                           Bus 8,9 - Even, Odd line color 4
                                           Bus 10,11 - Even, Odd line color 5

PCU Interface
pcu_dwu_sel                   1      In    Block select from the PCU. When pcu_dwu_sel is high both
                                           pcu_adr and pcu_dataout are valid.
pcu_rwn                       1      In    Common read/not-write signal from the PCU.
pcu_adr[7:2]                  6      In    PCU address bus. Only 6 bits are required to decode the
                                           address space for this block.
pcu_dataout[31:0]             32     In    Shared write data bus from the PCU.
dwu_pcu_rdy                   1      Out   Ready signal to the PCU. When dwu_pcu_rdy is high it
                                           indicates the last cycle of the access. For a write cycle this
                                           means pcu_dataout has been registered by the block and
                                           for a read cycle this means the data on dwu_pcu_data is
                                           valid.
dwu_pcu_data[31:0]            32     Out   Read data bus to the PCU.

DIU Interface
dwu_diu_wreq                  1      Out   DWU requests DRAM write. A write request must be
                                           accompanied by a valid write address together with valid write data
                                           and a write valid.
dwu_diu_wadr[21:5]            17     Out   Write address to DIU.
                                           17 bits wide (256-bit aligned word)
diu_dwu_wack                  1      In    Acknowledge from DIU that write request has been
                                           accepted and new write address can be placed on
                                           dwu_diu_wadr.
dwu_diu_data[63:0]            64     Out   Data from DWU to DIU. 256-bit word transfer over 4 cycles.
                                           First 64-bits is bits 63:0 of 256 bit word
                                           Second 64-bits is bits 127:64 of 256 bit word
                                           Third 64-bits is bits 191:128 of 256 bit word
                                           Fourth 64-bits is bits 255:192 of 256 bit word
dwu_diu_wvalid                1      Out   Signal from DWU indicating that data on dwu_diu_data is
                                           valid.
30.7.2 DWU partition

Figure 221. DWU partition



30.7.3 Configuration registers 

The configuration registers in the DWU are programmed via the PCU interface. Refer to section 21.8.2 on
page 257 for a description of the protocol and timing diagrams for reading and writing registers in the
DWU. Note that since addresses in SoPEC are byte aligned and the PCU only supports 32-bit register
reads and writes, the lower 2 bits of the PCU address bus are not required to decode the address space for
the DWU. When reading a register that is less than 32 bits wide zeros should be returned on the upper
unused bit(s) of dwu_pcu_data. Table 155 lists the configuration registers in the DWU.



Table 155. DWU registers description

Address     Register name         #bits   Reset     Description

Control Registers
0x00        Reset                 1       0x1       Active low synchronous reset, self de-activating. A
                                                    write to this register will cause a DWU block reset.
0x04        Go                    1       0x0       Active high bit indicating the DWU is programmed
                                                    and ready to use. A low to high transition will cause
                                                    DWU block internal states to reset (configuration
                                                    registers are not reset).

Dot Line Store Configuration
0x08-0x38   ColorBaseAdr[11:0]    12x17   0x00000   Specifies the base address (in words) in memory
                                                    where data from a particular half color (N) will be
                                                    placed.
0x3C-0x6C   ColorFifoSize[11:0]   12x8    0x00      Indicates the number of lines in the FIFO before
                                                    the line increment will wrap around in memory.
                                                    Bus 0,1 - Even, Odd line color 0
                                                    Bus 2,3 - Even, Odd line color 1
                                                    Bus 4,5 - Even, Odd line color 2
                                                    Bus 6,7 - Even, Odd line color 3
                                                    Bus 8,9 - Even, Odd line color 4
                                                    Bus 10,11 - Even, Odd line color 5
0x70        ColorLineSense        2       0x2       Specifies whether data written to DRAM for this
                                                    half color is increasing or decreasing sense.
                                                    0 - Decreasing sense
                                                    1 - Increasing sense
                                                    Bit 0 defines even color sense.
                                                    Bit 1 defines odd color sense.
0x74        ColorEnable           6       0x3F      Indicates whether a particular color is active or not.
                                                    When inactive no data is written to DRAM for that
                                                    color.
                                                    0 - Color off
                                                    1 - Color on
                                                    One bit per color, bit 0 is Color 0 and so on.
0x78        MaxWriteAhead         8       0x00      Specifies the maximum number of lines that the
                                                    DWU can be ahead of the LLU.
0x7C        LineSize              16      0x0000    Indicates the number of dots per line.

Working Registers
0x80        LineDotCnt            16      0x0000    Indicates the number of remaining dots in the
                                                    current line. (Read Only)
0x84        FifoFillLevel         8       0x00      Number of lines in the FIFO, written to but not
                                                    read. (Read Only)



A low to high transition of the Go register causes the internal states of the DWU to be reset. All
configuration registers will remain the same. The block indicates the transition to other blocks via the dwu_go_pulse
signal.



The ColorLineInc bus specifies the number of addresses (in 256-bit words) between successive half lines
in the dot line store. It is derived from the LineSize register by rounding up to the nearest 256-bit value. The
same value is used for all half colors.

if (line_size[7:0] != 0) then
    color_line_inc[7:0] = line_size[15:8] + 1
else
    color_line_inc[7:0] = line_size[15:8]
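
The same derivation can be written as a small Python function for checking register values. This is an illustrative sketch with a hypothetical function name; it follows the bit slices given above and keeps the result within 8 bits.

```python
def color_line_inc(line_size):
    """Derive the 8-bit ColorLineInc value from the 16-bit LineSize.

    The upper byte of LineSize is taken as the word count and is
    rounded up by one when the lower byte is non-zero.
    """
    upper = (line_size >> 8) & 0xFF
    if line_size & 0xFF:
        return (upper + 1) & 0xFF
    return upper
```

A LineSize that is an exact multiple of 256 is therefore not rounded up.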



30.7.4 Fifo fill level 

The DWU keeps a running total of the number of lines in the dot store FIFO. Each time the DWU writes a
line to DRAM (determined by the DIU interface subblock and signalled via line_wr) it increments the
filllevel and signals the line increment to the LLU (pulse on dwu_llu_line_wr). Conversely if it receives an
active llu_dwu_line_rd pulse from the LLU, the filllevel is decremented. If the filllevel increases to the
programmed max level (max_write_ahead) then the DWU stalls and indicates back to the DNC by
de-asserting the dwu_dnc_ready signal.

If one or more of the DIU buffers fill, the DIU interface signals the fill level logic via the buf_full signal
which in turn causes the DWU to de-assert the dwu_dnc_ready signal to stall the DNC. The buf_full
signals will remain active until the DIU services a pending request from the full buffer, reducing the buffer
level.

The DWU does not increment the fill level until a complete line of dot data is in DRAM, not just a
complete line received from the DNC. This ensures that the LLU cannot start reading a partial line from
DRAM before the DWU has finished writing the line.

The fill level is reset to zero each time a new page is started, on receiving a pulse via the dwu_go_pulse
signal.

The line fifo fill level can be read by the CPU via the PCU at any time by accessing the FifoFillLevel regis- 
ter. 
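
The fill level behaviour described above can be modelled with a minimal Python sketch. The class and
method names are hypothetical, and two details are assumptions rather than statements from this
document: a simultaneous write and read pulse cancel out, and the level is not decremented below zero.

```python
class FillLevel:
    """Behavioural sketch of the DWU line FIFO fill level (not RTL)."""

    def __init__(self, max_write_ahead: int):
        self.max_write_ahead = max_write_ahead
        self.level = 0

    def step(self, go_pulse=False, line_wr=False, line_rd=False, buf_full=False) -> bool:
        """Advance one cycle and return the resulting dwu_dnc_ready value."""
        if go_pulse:
            self.level = 0                    # new page: fill level cleared
        else:
            if line_wr:
                self.level += 1               # complete line now in DRAM
            if line_rd and self.level > 0:    # assumed guard against underflow
                self.level -= 1               # LLU finished reading a line
        # stall the DNC when the FIFO is at its limit or a DIU buffer is full
        return not (self.level >= self.max_write_ahead or buf_full)
```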
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30.7.5 Buffer address generator 



Figure 222. Buffer address generator sub-block 



30.7.5.1 Buffer address generator description

The buffer address generator subblock is responsible for accepting data from the DNC and writing it to the 
DIU buffers in the correct order. 

The buffer address and active bit-write for a particular dot data write is calculated by the buffer address 
generator based on the dot count of the current line, programmed sense of the color and the line size. 

All configuration registers should be programmed while the Go bit is set to zero. Once complete, the block
can be enabled by setting the Go bit to one. The transition from zero to one will cause the internal states
to reset.

If the color_line_sense signal for a color is one (i.e. increasing) then the bit-write generation is
straightforward, as dot data is aligned with a 256-bit boundary. So for the first dot in that color, bit 0 of
the wr_bit bus will be active (in buffer word 0), for the second dot bit 1 is active and so on to the 256th
dot where bit 63 is active (in buffer word 3). This is repeated for all 256-bit words until the final word,
where only a partial number of bits are written before the word is transferred to DRAM.

If the color_line_sense signal for a color is zero (i.e. decreasing) the bit-write generation for that color
is adjusted by an offset calculated from the pre-programmed line length (line_size). The offset adjusts the
bit write to allow the line to finish on a 256-bit boundary. For example if the line length was 400, for the
first dot received bit 7 (line length is halved because of odd/even lines of color) of the wr_bit is active
(buffer word 3), for the second bit 6 (buffer word 3), and so on to the 200th dot of data with bit 0 of
wr_bit active (buffer word 0).
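
The worked example above can be reproduced with a small Python sketch. It models only the positions
given in the prose, not the counter hardware; the function name and the 0-based dot_index argument are
illustrative assumptions.

```python
def write_position(dot_index: int, line_size: int, increasing: bool):
    """Return (buffer word, bit) within a 256-bit word for a dot of one half color.

    dot_index is 0-based within the half line (assumption for this sketch).
    """
    half_line = line_size // 2                  # odd and even dots are stored separately
    if increasing:
        position = dot_index                    # starts on a 256-bit boundary
    else:
        position = half_line - 1 - dot_index    # finishes on a 256-bit boundary
    return divmod(position % 256, 64)           # four 64-bit buffer words per 256-bit word
```

With a line length of 400 and decreasing sense, the first dot maps to word 3, bit 7 and the 200th dot
to word 0, bit 0, matching the example in the text.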
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30.7.5.2 Bit-write decode 

The buffer address generator contains 2 instances of the bit-write decode, one configured for odd dot data
the other for even. The counter (either up or down counter) used to generate the addresses is selected by
the color_line_sense signal. Each block determines if it is active on this cycle by comparing its configured
type with the current dot count address and the data_active signal.

The wr_bit bus is a direct decoding of the lower 6 count bits (count[6:1]), and the DIU buffer address is
the remaining higher bits of the counter (count[10:7]).

The signal generation is given as follows: 
// determine the counter to use
if (color_line_sense == 1)
    count = up_cnt[10:0]
else
    count = dn_cnt[10:0]
// determine if active, based on instance type
    = data_active & (count[0] odd_even_type)    // odd = 1, even = 0
// determine the bit write value
wr_bit[63:0] = decode(count[6:1])
// determine the buffer 64-bit address
wr_adr[3:0] = count[10:7]
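
As an illustration of the decode step only, the following Python sketch derives the one-hot bit-write
value and the buffer address from an 11-bit count. The function name is hypothetical and the activity
qualification is deliberately left out.

```python
def bit_write_decode(count: int):
    """Return (wr_bit, wr_adr) for an 11-bit dot count."""
    count &= 0x7FF                          # count[10:0]
    wr_bit = 1 << ((count >> 1) & 0x3F)     # one-hot decode of count[6:1]
    wr_adr = (count >> 7) & 0xF             # count[10:7], 64-bit word address
    return wr_bit, wr_adr
```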



30.7.5.3 Up counter generator 

The up counter increments for each new dot and is used to determine the write position of the dot in the
DIU buffers for increasing sense data. At the end of each line of dot data (as indicated by line_fin), the
counter is rounded up to the nearest 256-bit word boundary. This causes the DIU buffers to be flushed to
DRAM including any partially filled 256-bit words. The counter is reset to zero if the dwu_go_pulse is
one.

// Up-Counter Logic
if (dwu_go_pulse == 1) then
    up_cnt[10:0] = 0
elsif (line_fin == 1) then
    // round up
    if (up_cnt[8:1] != 0)
        up_cnt[10:9]++
    else
        up_cnt[10:9]
    // bit-selector
    up_cnt[7:0] = 0
elsif ((dnc_dwu_avail == 1) and (dwu_dnc_ready == 1)) then
    up_cnt[7:0]++



30.7.5.4 Down counter generator 

The down counter logic decrements for each new dot and is used to determine the write position of the dot
in the DIU buffers for decreasing sense data. When the dwu_go_pulse bit is one the lower bits (i.e. 8 to 0)
of the counter are reset to the line size value (line_size), and the higher bits to zero. The bits used to
determine the bit-write values and 64-bit word addresses in the DIU buffers begin at line size and count
down to zero. The remaining higher bits, which are used to determine the DIU buffer 256-bit address and
buffer fill level, begin at zero and count up. The counter is active when valid dot data is present, i.e.
dnc_dwu_avail equals 1.

When the end of line is detected (line_fin equals 1) the counter is rounded to the next 256-bit word, and the
lower bits are reset to the line size value.

// Down-Counter Logic
if (dwu_go_pulse == 1) then
    dn_cnt[8:0] = line_size[8:0]
    dn_cnt[10:9] = 0
elsif (line_fin == 1) then
    // perform rounding up
    if (dn_cnt[8:1] != 0)
        dn_cnt[10:9]++
    else
        dn_cnt[10:9]
    // bit-select is reset
    dn_cnt[8:0] = line_size[8:0]    // bit select bits
elsif ((dnc_dwu_avail == 1) AND (dwu_dnc_ready == 1)) then
    dn_cnt[8:0]--
    dn_cnt[10:9]++



30.7.5.5 Dot counter

The counter is reset to line_size when dwu_go_pulse is 1.




30.7.6 DIU buffer 
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30.7.7 DIU interface 
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Figure 223. DIU Interface sub-block 



30.7.7.1 DIU interface general description

The DIU interface determines when a buffer needs a data word to be transferred to DRAM. It generates the 
DRAM address based on the dot line position, the color base address and the other programmed parame- 
ters. A write request is made to DRAM and when acknowledged a 256-bit data word is transferred. The 
interface determines if further words need to be transferred and repeats the transfer process. 

If the FIFO in DRAM has reached its maximum level, or one of the buffers has temporarily filled, the
DWU will stall data generation from the DNC.

A similar process is repeated for each line until the end of page is reached. At the end of a page the CPU is 
required to reset the internal state of the block before the next page can be printed. A low to high transition 
of the Go register will cause the internal block reset, which causes all registers in the block to reset with 
the exception of the configuration registers. The transition is indicated to subblocks by a pulse on the
dwu_go_pulse signal.
30.7.7.2 Interface controller
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Figure 224. Interface controller state diagram 

The interface controller state machine waits in the Idle state until an active request is indicated by the
read pointer (via the req_active signal). When an active request is received the machine proceeds to the
ColorSelect state to determine which buffers need a data transfer. In the ColorSelect state it cycles through
each color and determines if the color is enabled (and consequently the buffer needs servicing). If enabled
it jumps to the Request state, otherwise the color_cnt is incremented and the next color is checked.

In the Request state the machine issues a write request to the DIU and waits in the Request state until the
write request is acknowledged by the DIU (diu_dwu_wack). Once an acknowledge is received the state
machine clocks through 4 cycles, transferring a 64-bit data word each cycle and incrementing the
corresponding buffer read address. After transferring the data to the DIU the machine returns to the
ColorSelect state to determine if further buffers need servicing. On the transition the controller indicates
to the address generator (adr_update) to update the address for that selected color.

If all colors are transferred (color_cnt equal to 6) the state machine returns to Idle, updating the last word
flags (group_fin) and request logic (req_update).

The dwu_diu_wvalid signal is a delayed version of the buf_rd_en signal to allow for pipeline delays 
between data leaving the buffer and being clocked through to the DIU block. 

The state machine will return from any state to Idle if the reset or the dwu_go_pulse is 1.
30.7.7.3 Address generator



The address generator block maintains 12 pointers (color_adr[11:0]) to DRAM corresponding to the current
write address in the dot line store for each half color. When a DRAM transfer occurs the address pointer is
used first and then updated for the next transfer for that color. The pointer used is selected by the req_sel
bus, and the pointer update is initiated by the adr_update signal from the interface controller.

The pointer update is dependent on the sense of the color of that pointer, the pointer position in a line and
the line position in the FIFO. The programming of the color_base_adr needs to be adjusted depending on
the sense of the colors. For increasing sense colors the color_base_adr specifies the address of the first
word of the first line of the FIFO, whereas for decreasing sense colors the color_base_adr specifies the
address of the last word of the first line of the FIFO.

For increasing colors, the initialization value (i.e. when dwu_go_pulse is 1) is the color_base_adr. For
each word that is written to DRAM the pointer is incremented. If the word is the last word in a line (as
indicated by last_wd from the read pointers) the pointer is also incremented. If the word is the last word in
a line and the line is the last line in the FIFO (indicated by fifo_end from the line counter) the pointer is
reset to color_base_adr.

In the case of decreasing sense colors, the initialization value (i.e. when dwu_go_pulse is 1) is the
color_base_adr. For each line of decreasing sense color data the pointer starts at the line end and
decrements to the line start. For each word that is written to DRAM the pointer is decremented. If the word is
the last word in a line the pointer is incremented by color_line_inc * 2 + 1: one line length to account for
the line of data just written, and another line length for the next line to be written. If the word is the last
word in a line, and the line is the last line in the FIFO, the pointer is reset to the initialization value (i.e.
color_base_adr).

The address is calculated as follows: 

if (dwu_go_pulse == 1) then
    color_adr[11:0] = color_base_adr[11:0][21:5]
elsif (adr_update == 1) then {
    // determine the color
    color = req_sel[3:0]
    if ((fifo_end[color] == 1) and (last_wd == 1)) then {
        // line end and fifo wrap
        color_adr[color] = color_base_adr[color][21:5]
    }
    elsif (last_wd == 1) then {
        // just a line end no fifo wrap
        if (color_line_sense[color % 2] == 1) then    // increasing sense
            color_adr[color]++
        else                                           // decreasing sense
            color_adr[color] = color_adr[color] + (color_line_inc * 2) + 1
    }
    else {
        // regular word write
        if (color_line_sense[color % 2] == 1) then    // increasing sense
            color_adr[color]++
        else                                           // decreasing sense
            color_adr[color]--
    }
}
// select the correct address for this transfer
dwu_diu_wadr = color_adr[req_sel]
30.7.7.4 Line count

The line counter logic counts the number of dot data lines stored in DRAM for each color. A separate
pointer is maintained for each color. A line pointer is updated each time the final word of a line is
transferred to DRAM. This is determined by a combination of the adr_update and last_wd signals. The pointer
to update is indicated by the req_sel bus.

When an update occurs to a pointer it is compared to zero. If it is non-zero the count is decremented,
otherwise the counter is reset to color_fifo_size. If a counter is zero the fifo_end signal is set high to
indicate to the address generator block that the line is the last line of this color's FIFO.

If the dwu_go_pulse signal is one the counters are reset to color_fifo_size.

if (dwu_go_pulse == 1) then
    line_cnt[11:0] = color_fifo_size[11:0]
elsif ((adr_update == 1) AND (last_wd == 1)) then {
    // determine the pointer to operate on
    color = req_sel[3:0]
    // update the pointer
    if (line_cnt[color] == 0) then
        line_cnt[color] = color_fifo_size[color]
    else
        line_cnt[color]--
}
// count is zero its the last line of fifo
for (i = 0; i < 12; i++) {
    fifo_end[i] = (line_cnt[i] == 0)
}
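
The line counter can be sketched in Python as follows. The class and method names are hypothetical;
update() stands for a cycle in which adr_update and last_wd are both 1 for the selected half color.

```python
class LineCounter:
    """Behavioural sketch of the per-half-color line counters (not RTL)."""

    def __init__(self, color_fifo_size):
        self.color_fifo_size = list(color_fifo_size)   # one entry per half color
        self.line_cnt = list(color_fifo_size)

    def go_pulse(self):
        """Reload every counter, as on dwu_go_pulse."""
        self.line_cnt = list(self.color_fifo_size)

    def update(self, color: int):
        """Final word of a line transferred for half color `color`."""
        if self.line_cnt[color] == 0:
            self.line_cnt[color] = self.color_fifo_size[color]   # wrap to FIFO start
        else:
            self.line_cnt[color] -= 1

    def fifo_end(self, color: int) -> bool:
        """True when the current line is the last line of this color's FIFO."""
        return self.line_cnt[color] == 0
```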

30.7.7.5 Read Pointer 

The read pointer logic maintains the buffer read address pointers. The read pointer is used to determine 
which 64-bit words to read from the buffer for transfer to DRAM. 

The read pointer logic compares the read and write pointers of each DIU buffer to determine which buffers
require data to be transferred to DRAM (pend[11:0] bus), and which buffers are full (the buf_full signal).
Only enabled buffers are considered, as indicated by the color_enable bus.

Buffers are grouped into odd and even buffer groups. If an odd buffer requires DRAM access the
odd_pend signal will be active, and if an even buffer requires DRAM access the even_pend signal will be
active. If both odd and even buffers require DRAM access, the even buffers will get serviced first.

If any buffer requires a DRAM transfer, the logic will indicate this to the interface controller via the
req_active signal, with the odd_even_sel signal determining which group of buffers gets serviced. The
interface controller will check the color_enable signal and issue DRAM transfers for all enabled colors in a
group. When the transfers are complete it tells the read pointer logic to update the requests pending via
the req_update signal.

The req_sel[3:0] signal tells the address generator which buffer is being serviced. It is constructed from
the odd_even_sel signal and the color_cnt[2:0] bus from the interface controller. When data is being
transferred to DRAM the word pointer and read pointer for the corresponding buffer are updated. The req_sel
determines which pointer should be incremented.
// determine which buffers need updates
for (i = 0; i < 12; i++) {
    // determine if request is active, filtered by color enable
    if (wr_adr[i][3:2] != rd_adr[i][3:2])
        pend[i] = color_enable[i / 2]
    else
        pend[i] = 0
    // determine if any enabled buffer is full
    if (((wr_adr[i][3:0] - rd_adr[i][3:0]) > 7) AND (color_enable[i / 2] == 1)) then
        buf_full = 1
}
// Odd half colors (1,3,5,7,9,11), even half colors (0,2,4,6,8,10)
odd_pend = (pend[1] | pend[3] | pend[5] | pend[7] | pend[9] | pend[11])
even_pend = (pend[0] | pend[2] | pend[4] | pend[6] | pend[8] | pend[10])
// fixed servicing order, only update when controller dictates so
if (req_update == 1) then {
    if (even_pend == 1) then        // even always first
        odd_even_sel = 0
        req_active = 1
    elsif (odd_pend == 1) then      // then check odd
        odd_even_sel = 1
        req_active = 1
    else                            // nothing active
        odd_even_sel = 0
        req_active = 0
}
// selected requestor
req_sel[3:0] = {color_cnt[2:0], odd_even_sel}    // concatenation
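
A minimal Python sketch of the pending and arbitration logic follows. The function names are
hypothetical, the pointer difference is taken modulo 16 on the assumption that the addresses are 4-bit
values, and odd_even_sel is taken to be 1 when the odd group is selected.

```python
def pending(wr_adr, rd_adr, color_enable: int):
    """Return (pend list of 12 flags, buf_full) from 4-bit write and read addresses."""
    pend = []
    buf_full = False
    for i in range(12):
        enabled = (color_enable >> (i // 2)) & 1         # one enable bit per color
        differ = ((wr_adr[i] >> 2) & 3) != ((rd_adr[i] >> 2) & 3)
        pend.append(int(differ and enabled))
        # assumed modulo-16 difference between the 4-bit pointers
        if ((wr_adr[i] - rd_adr[i]) & 0xF) > 7 and enabled:
            buf_full = True
    return pend, buf_full


def arbitrate(pend):
    """Return (odd_even_sel, req_active) with the even group serviced first."""
    if any(pend[0::2]):      # even half colors 0, 2, ..., 10
        return 0, 1
    if any(pend[1::2]):      # odd half colors 1, 3, ..., 11
        return 1, 1
    return 0, 0
```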

The read address pointer logic consists of 12 2-bit counters and a word select pointer. The pointers are
reset when dwu_go_pulse is one. The word pointer (word_ptr) is common to all buffers and is used to read
out the 64-bit words from the DIU buffer. It is incremented when buf_rd_en is active. If the word_ptr is 3
and the buf_rd_en is active the selected read pointer (rd_ptr[req_sel]) will be incremented. A
concatenation of the read pointer and the word pointer is used to construct the buffer read address. The read
pointers are not reset at the end of each line.
// determine which pointer to update
if (dwu_go_pulse == 1) then
    rd_ptr[11:0] = 0
    word_ptr = 0
elsif (buf_rd_en == 1) then {
    word_ptr++
    if (word_ptr == 3) then
        rd_ptr[req_sel]++
}
// create the address from the pointer, and word reader
rd_adr[req_sel] = {rd_ptr[req_sel], word_ptr}    // concatenation

The read pointer block determines if the word being read from the DIU buffers is the last word of a line.
The buffer address generator indicates that the last dot is being written into the buffers via the line_fin
signal. When received the logic marks the 256-bit word in the buffers as the last word. When the last word is
read from the DIU buffer and transferred to DRAM, the flag for that word is reflected to the address
generator.
// line end set the flags
if (dwu_go_pulse == 1) then
    last_flag[1:0][1:0] = 0
elsif (line_fin == 1) then
    // determines the current 256-bit word even been written to
    last_flag[0][wr_adr[0][2]] = 1    // even group flag
    // determines the current 256-bit word odd been written to
    last_flag[1][wr_adr[1][2]] = 1    // odd group flag
// last word reflection to address generator
last_wd = last_flag[odd_even_sel][rd_ptr[req_sel][0]]
// clear the flag
if (group_fin == 1) then
    last_flag[odd_even_sel][rd_ptr[req_sel][0]] = 0

When a complete line has been written into the DIU buffers (but has not yet been transferred to DRAM)
the buffer address generator block will pulse the line_fin signal. The DWU must wait until all enabled
buffers are transferred to DRAM before signaling the LLU that a complete line is available in the dot line
store (dwu_llu_line_wr signal). When the line_fin is received all buffers will require transfer to DRAM.
Due to the arbitration, the even group will get serviced first then the odd. As a result the line finish pulse
to the LLU is generated from the last_flag of the odd group.

// must be odd, odd group transfer complete and the last word
dwu_llu_line_wr = odd_even_sel AND group_fin AND last_wd
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31 Line Loader Unit (LLU) 



31.1 Overview 



The Line Loader Unit (LLU) reads dot data from the line buffers in DRAM and structures the data into
even and odd dot channels destined for the same print time. The blocks of dot data are transferred to the
PHI and then to the printhead. Figure 225 shows a high level data flow diagram of the LLU in context.
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Figure 225. High level data flow diagram of LLU In context 



31.2 Physical requirement imposed by the printhead

The DWU re-orders dot data into 12 separate dot data line FIFOs in the DRAM. Each FIFO holds the odd or
even data of one of the 6 colors. The LLU reads the dot data line FIFOs and sends the data to the printhead
interface. The LLU decides when data should be read from the dot data line FIFOs to correspond with the
time that the particular nozzle on the printhead is passing the current line. The interaction of the DWU and
LLU with the dot line FIFOs compensates for the physical spread of nozzles firing over several lines at
once. For further explanation see Section 30 Dotline Writer Unit (DWU) and Section 32 PrintHead
Interface (PHI). Figure 226 shows the physical relationship of nozzle rows and the line time the LLU starts
reading from the dot line store.
Figure 226. Paper and printhead nozzles relationship (example with D1=D2=5)
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Within each line of dot data the LLU is required to generate an even and odd dot data stream to the PHI 
block. Figure 227 shows the even and odd dot streams as they would map to an example bi-lithic printhead.
The PHI block determines which stream should be directed to which printhead IC. 
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Figure 227. Printhead structure and dot generate order 
31.3 Dot generate and transmit order

The structure of the printhead ICs dictates the dot transmit order to each printhead IC. The LLU reads data
from the dot line FIFO and generates an even and odd dot stream which is then re-ordered (in the PHI) into
the transmit order for transfer to the printhead.

The DWU separates dot data into even and odd half lines for each color and stores them in DRAM. It can 
store odd or even dot data in increasing or decreasing order in DRAM. The order is programmable but for 
descriptive purposes assume even in increasing order and odd in decreasing order. The dot order structure 
in DRAM is shown in Figure 219. 

The LLU contains 2 dot generator units. Each dot generator reads dot data from DRAM and generates a 
stream of odd or even dots. The dot order may be increasing or decreasing depending on how the DWU 
was programmed to write data to DRAM. An example of the even and odd dot data streams to DRAM is 
shown in Figure 228. In the example the odd dot generator is configured to produce odd dot data in
decreasing order and the even dot generator produces dot data in increasing order. 

The PHI block accepts the even and odd dot data streams and reconstructs the streams into transmit order 
to the printhead. 

The LLU line size refers to the page width in dots and not necessarily the printhead width. The page width
is often the dot margin number of dots less than the printhead width. They can be the same size for full
bleed printing.
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Generate dot order (to the PHI) 
Example: Line with 13624 dots, with 7:3 printhead
Figure 228. Dot data generated and transmitted order 
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31.4 LLU START-UP 

At the start of a page the LLU must wait for the dot line store in DRAM to fill to a configured level (given
by FifoReadThreshold) before starting to read dot data. Once the LLU starts processing dot data for a page
it must continue until the end of the page. The DWU (and other PEP blocks in the pipeline) must ensure
there is always data in the dot line store for the LLU to read, otherwise the LLU will stall, causing the PHI
to stall and potentially generate a print error. The FifoReadThreshold should be chosen to allow for data
rate mismatches between the DWU write side and the LLU read side of the dot line FIFO. The LLU will not
generate any dot data until the FifoReadThreshold level in the dot line FIFO is reached.

Once the FifoReadThreshold is reached the LLU begins page processing, and the FifoReadThreshold is
ignored from then on.

When the LLU begins page processing it produces dot data for all colors (although some dot data color
may be null data). The LLU tracks the line count of the current page, and when the line count exceeds the
ColorRelLine configured value for a particular color the LLU will start reading from that color's FIFO in
DRAM. For colors that have not exceeded the ColorRelLine value the LLU will generate null data (zero
data) and not read from DRAM for that color. ColorRelLine[N] specifies the number of lines separating
the Nth half color and the first half color to print on that page.

For the example printhead shown in Figure 226, color 0 odd will start at line 0, and the remaining colors
will all have null data. Color 0 odd alone will contain real data until line 5, when color 0 odd and even
will contain real data and the remaining colors will contain null data. At line 10, color 0 odd and even and
color 1 odd will contain real data, with remaining colors containing null data. Every 5 lines a new half
color will contain real data and the remaining half colors null data until line 55, when all colors will
contain real data. In the example ColorRelLine[0] = 5, ColorRelLine[1] = 0, ColorRelLine[2] = 15,
ColorRelLine[3] = 10, etc.
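
The start-up schedule in this example can be tabulated with a short Python sketch. Two points are
assumptions: the list entries beyond the first four are extrapolated from the "every 5 lines" pattern,
and the comparison uses "greater than or equal" so that the result matches the line numbers quoted in
the example. The names are illustrative only.

```python
# ColorRelLine values for the example; indices 4 to 11 are extrapolated (assumption)
EXAMPLE_COLOR_REL_LINE = [5, 0, 15, 10, 25, 20, 35, 30, 45, 40, 55, 50]


def active_half_colors(line: int, color_rel_line):
    """Return one flag per half color: True for real data, False for null data."""
    return [line >= rel for rel in color_rel_line]


def active_indices(line: int, color_rel_line):
    """Return the bus indices that carry real data on the given line."""
    return [i for i, on in enumerate(active_half_colors(line, color_rel_line)) if on]
```

Under these assumptions, line 0 has only bus 1 (color 0 odd) active, and all 12 half colors are active
from line 55.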
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It is possible to turn off any one of the color planes of data (via the ColorEnable register), in such cases the 
LLU will generate zeroed dot data information to the PHI as normal but will not read data from the
DRAM. 



31.4.1 LLU bandwidth requirements 



The LLU is required to generate data for feeding to the printhead interface. The rate required is dependent
on the printhead construction and on the line rate configured. The maximum data rate the LLU can
produce is 12 bits of dot data per cycle, but the PHI consumes at 12 bits per phiclk cycle (2/3 pclk rate),
i.e. 8 bits per pclk cycle. Therefore the DRAM bandwidth requirement for a double buffered LLU is 8 bits
per cycle on average. If 1.5 buffering is used then the peak bandwidth requirement is doubled to 16 bits per
cycle but the average remains at 8 bits per cycle. Note that while the LLU and PHI could produce data at
the 8 bits per cycle rate, the DWU can only produce data at the 6 bits per cycle rate.
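
The arithmetic behind these figures is small enough to check directly. The Python sketch below uses
exact fractions; the function and variable names are illustrative.

```python
from fractions import Fraction


def bits_per_pclk(bits_per_phiclk: int, phiclk_per_pclk: Fraction) -> Fraction:
    """Convert a per-phiclk consumption rate into bits per pclk cycle."""
    return bits_per_phiclk * phiclk_per_pclk


average = bits_per_pclk(12, Fraction(2, 3))   # PHI consumption seen from the pclk domain
peak_1_5_buffering = 2 * average              # peak is doubled with 1.5 buffering
```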



Doc: SoPEC_hardware_design, Version: 2.3, Proprietary Document, Nov 2002

31.5 Implementation 



31.5.1 LLU partition 



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Figure 229. LLU partition 



31.5.2 Definitions of I/O

Table 156. LLU I/O definition

Port name                     Pins   I/O   Description
----------------------------  -----  ----  ------------------------------------------------------------
Clocks and Resets
pclk                          1      In    System clock.
prst_n                        1      In    System reset, synchronous active low.

PHI Interface
llu_phi_data[1:0][5:0]        2x6    Out   Dot data from LLU to the PHI, each bit is a color plane
                                           5 downto 0.
                                           Bus 0 - Even dot data stream
                                           Bus 1 - Odd dot data stream
                                           Data is active when the corresponding bit is active in the
                                           llu_phi_avail bus.
phi_llu_ready[1:0]            2      In    Indicates that PHI is ready to accept data from the LLU.
                                           0 - Even dot data stream
                                           1 - Odd dot data stream
llu_phi_avail[1:0]            2      Out   Indicates valid data present on corresponding llu_phi_data.
                                           0 - Even dot data stream
                                           1 - Odd dot data stream

DIU Interface
llu_diu_rreq                  1      Out   LLU requests DRAM read. A read request must be accompanied
                                           by a valid read address.
llu_diu_radr[21:5]            17     Out   Read address to DIU.
                                           17 bits wide (256-bit aligned word).
diu_llu_rack                  1      In    Acknowledge from DIU that read request has been accepted and
                                           new read address can be placed on llu_diu_radr.
diu_data[63:0]                64     In    Data from DIU to LLU. Each access is 256 bits received over
                                           4 clock cycles.
                                           First 64 bits is bits 63:0 of 256-bit word
                                           Second 64 bits is bits 127:64 of 256-bit word
                                           Third 64 bits is bits 191:128 of 256-bit word
                                           Fourth 64 bits is bits 255:192 of 256-bit word
diu_llu_rvalid                1      In    Signal from DIU telling LLU that valid read data is on the
                                           diu_data bus.

DWU Interface
dwu_llu_line_wr               1      In    DWU line write. Indicates that the DWU has completed a full
                                           line write. Active high.
llu_dwu_line_rd               1      Out   LLU line read. Indicates that the LLU has completed a line
                                           read. Active high.
dwu_llu_cmbsize[11:0][7:0]    12x8   In    Indicates the number of lines in the FIFO before the line
                                           increment will wrap around in memory.

PCU Interface
pcu_llu_sel                   1      In    Block select from the PCU. When pcu_llu_sel is high both
                                           pcu_adr and pcu_dataout are valid.
pcu_rwn                       1      In    Common read/not-write signal from the PCU.
pcu_adr[7:2]                  6      In    PCU address bus. Only 6 bits are required to decode the
                                           address space for this block.
pcu_dataout[31:0]             32     In    Shared write data bus from the PCU.
llu_pcu_rdy                   1      Out   Ready signal to the PCU. When llu_pcu_rdy is high it
                                           indicates the last cycle of the access. For a write cycle
                                           this means pcu_dataout has been registered by the block and
                                           for a read cycle this means the data on llu_pcu_data is
                                           valid.
llu_pcu_data[31:0]            32     Out   Read data bus to the PCU.



31.5.3 Configuration registers 

The configuration registers in the LLU are programmed via the PCU interface. Refer to section 21.8.2 on
page 257 for a description of the protocol and timing diagrams for reading and writing registers in the 
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LLU. Note that since addresses in SoPEC are byte aligned and the PCU only supports 32-bit register reads 
and writes, the lower 2 bits of the PCU address bus are not required to decode the address space for the 
LLU. When reading a register that is less than 32 bits wide zeros should be returned on the upper unused 
bit(s) of llu_pcu_data. Table 157 lists the configuration registers in the LLU.



Table 157. LLU registers description

Address      Register name        #bits  Reset    Description
-----------  -------------------  -----  -------  -----------------------------------------------------
Control Registers
0x00         Reset                1      0x1      Active low synchronous reset, self de-activating. A
                                                  write to this register will cause a LLU block reset.
0x04         Go                   1      0x0      Active high bit indicating the LLU is programmed and
                                                  ready to use. A low to high transition will cause LLU
                                                  block internal states to reset.

Configuration
0x08 - 0x38  ColorBaseAdr[11:0]   12x17  0x00000  Specifies the base address (in words) in memory
                                                  where data from a particular half color (N) will be
                                                  placed.
0x3C         ColorEnable          6      0x3F     Indicates whether a particular color is active or not.
                                                  When inactive no data is written to DRAM for that
                                                  color.
                                                  0 - Color off
                                                  1 - Color on
                                                  One bit per color, bit 0 is Color 0 and so on.
0x40         LineSize             16     0x0000   Indicates the number of dots per line.
0x44         FifoReadThreshold    8      0x00     Specifies the number of lines that should be in the
                                                  FIFO before the LLU starts reading.
0x48 - 0x78  ColorRelLine[11:0]   12x8   0x00     Specifies the relative number of lines to wait from
                                                  the first before starting to read dot data from the
                                                  corresponding dot data FIFO.
                                                  Bus 0,1 - Even, Odd line color 0
                                                  Bus 2,3 - Even, Odd line color 1
                                                  Bus 4,5 - Even, Odd line color 2
                                                  Bus 6,7 - Even, Odd line color 3
                                                  Bus 8,9 - Even, Odd line color 4
                                                  Bus 10,11 - Even, Odd line color 5

Working Registers
0x7C         FifoFillLevel        8      0x00     Number of lines in the dot line FIFO, line written in
                                                  but not read out. (Read Only)


A low to high transition of the Go register causes the internal states of the LLU to be reset. All
configuration registers will remain the same. The block indicates the transition to other blocks via the
llu_go_pulse signal.



The ColorLineInc bus specifies the number of addresses (in 256-bit words) between successive half lines
in the dot line store, and is used to determine when a half line of data has been read from DRAM. It is
derived from the LineSize register by rounding up to the nearest 256-bit value. The same value is used for
all half colors.

if (line_size[7:0] != 0) then
    color_line_inc[7:0] = line_size[15:8] + 1
else
    color_line_inc[7:0] = line_size[15:8]
31.5.4 Dot generator





Figure 230. Dot generator RTL Diagram 

The dot generator block is responsible for reading dot data from the DIU buffers and sending the dot data
in the correct order to the PHI block. The dot generator waits for the llu_en signal from the fifo fill level
block. Once active it starts reading data from the 6 DIU buffers and generating dot data for feeding to the
PHI.

In the LLU there are two instances of the dot generator, one generating odd data and the other generating
even data.

At any time the ready bit from the PHI could be de-asserted. If this happens the dot generator will stop
generating data, and wait for the ready bit to be re-asserted.



31.5.4.1 Dot count



In normal operation the dot counter will wait for the llu_en and the ready to be active before starting to
count. The dot count will produce data as long as the phi_llu_ready is active. If the phi_llu_ready signal
goes low the count will be stalled.

The dot counter increments for each dot that is processed per line. It is used to determine the line finish
position, and the bit select value for reading from the DIU buffers. The counter is reset after each line is
processed (line_fin signal). It determines when a line is finished by comparing the dot count with the
configured line size divided by 2 (note that odd numbers of dots will be rounded down).

// define the line finish
if (dot_cnt[14:0] == line_size[15:1]) then
    line_fin = 1
else
    line_fin = 0
// determine if word is valid
dot_active = ((llu_en == 1) AND (phi_llu_ready == 1) AND (buf_emp == 0))
// counter logic
if (llu_go_pulse == 1) then
    dot_cnt = 0
elsif ((dot_active == 1) AND (line_fin == 1)) then
    dot_cnt = 0
elsif (dot_active == 1) then
    dot_cnt = dot_cnt + 1
else
    dot_cnt = dot_cnt
// calculate the word select bits
bit_sel[5:0] = dot_cnt[5:0]
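
The counter behaviour described above can be exercised with a short Python model. It is a sketch under assumptions: one call stands for one clock cycle, the function name is hypothetical, and the third enabling condition is taken to be a "buffer not empty" flag (buf_emp), which is a reading of the logic rather than something this section states outright.

```python
def dot_count_step(dot_cnt, line_size, llu_go_pulse, llu_en, phi_llu_ready, buf_emp):
    """Return (next dot_cnt, line_fin, bit_sel) for one clock cycle."""
    # dot_cnt[14:0] compared with line_size[15:1], i.e. line size divided by 2
    line_fin = (dot_cnt & 0x7FFF) == ((line_size & 0xFFFF) >> 1)
    dot_active = llu_en and phi_llu_ready and not buf_emp
    if llu_go_pulse:
        next_cnt = 0
    elif dot_active and line_fin:
        next_cnt = 0                  # line finished, restart the count
    elif dot_active:
        next_cnt = dot_cnt + 1
    else:
        next_cnt = dot_cnt            # stalled: PHI not ready or LLU disabled
    bit_sel = dot_cnt & 0x3F          # dot_cnt[5:0]
    return next_cnt, line_fin, bit_sel
```

With line_size set to 8 the counter runs 0, 1, 2, 3, 4 and then returns to 0; de-asserting phi_llu_ready holds the current value.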
The dot generator also maintains a read buffer pointer which is incremented each time a 64-bit word is
processed. The pointer is used to address the correct 64-bit dot data word within the DIU buffers. The
pointer is reset when llu_go_pulse is 1. Unlike the dot counter, the read pointer is not reset each line but
is rounded up to the nearest 256-bit word. This allows for more efficient use of the DIU buffers at line finish.
// read pointer logic
if (llu_go_pulse == 1) then
    read_adr = 0
elsif ((dot_active == 1) AND (dot_cnt[5:0] == 63)) then
    read_adr++                        // normal increment
elsif ((dot_active == 1) AND (line_fin == 1)) then {
    // special end of line case
    if (dot_cnt[7:0] != 0) then
        read_adr[3:2]++               // end of line round up
        read_adr[1:0] = 0
}

31.5.5 Fifo fill level 

The LLU keeps a running total of the number of lines in the dot line store FIFO. Every time the DWU
signals a line end (dwu_llu_line_wr active pulse) it increments the filllevel. Conversely, if the LLU detects
a line end (line_rd pulse) the filllevel is decremented and the line read is signalled to the DWU via the
llu_dwu_line_rd signal.

The LLU fill level block is used to determine when the dot line store has enough data stored before the LLU
should begin reading. The LLU is disabled at page start. It waits for the DWU to write lines to the
dot line FIFO, and for the fill level to increase. The LLU remains disabled until the fill level has reached
the programmed threshold (fifo_read_thres). When the threshold is reached it signals the LLU to start
processing the page by setting llu_en high. Once the LLU has started processing dot data for a page it will
not stop if the filllevel falls below the threshold.

The line fifo fill level can be read by the CPU via the PCU at any time by accessing the FifoFillLevel
register. The CPU must toggle the Go register in the LLU for the block to be correctly initialized at page
start and the fifo level reset to zero.

if (llu_go_pulse == 1) then
    filllevel = 0
elsif ((line_rd == 1) AND (dwu_llu_line_wr == 1)) then
    // do nothing
elsif (line_rd == 1) then
    filllevel--
elsif (dwu_llu_line_wr == 1) then
    filllevel++

// determine the threshold, and set the LLU going
if (llu_go_pulse == 1) then
    llu_en = 0
elsif (filllevel == fifo_read_threshold) then
    llu_en = 1
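
A behavioural sketch of the fill level block in Python follows. The class name and the one-call-per-clock convention are hypothetical, and the model assumes the threshold comparison sees the level as updated in the same step, which the hardware may or may not do.

```python
class FillLevel:
    """Illustrative model of the LLU fifo fill level logic."""

    def __init__(self, fifo_read_threshold):
        self.fifo_read_threshold = fifo_read_threshold
        self.filllevel = 0
        self.llu_en = False

    def step(self, llu_go_pulse=False, line_rd=False, dwu_llu_line_wr=False):
        if llu_go_pulse:
            self.filllevel = 0
            self.llu_en = False
            return
        if line_rd and dwu_llu_line_wr:
            pass                      # simultaneous read and write cancel out
        elif line_rd:
            self.filllevel -= 1
        elif dwu_llu_line_wr:
            self.filllevel += 1
        # Assumption: compare against the level after this step's update.
        # llu_en is sticky: it is only cleared by llu_go_pulse.
        if self.filllevel == self.fifo_read_threshold:
            self.llu_en = True
```

The sticky enable reflects the statement that the LLU does not stop once the level later falls below the threshold.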
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31.5.6 DIU interface 
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Figure 231. DIU interface 



31.5.6.1 DIU interface description



The DIU interface block is responsible for determining when dot data needs to be read from DRAM, keeping
the dot generators supplied with data, and calculating the DRAM read address based on configured
parameters, FIFO fill levels and position in a line.

The fill level block enables DIU requests by activating the llu_en signal. The DIU interface controller then
issues requests to the DIU for the LLU buffers to be filled with dot line data (or fills the LLU buffers with
null data without requesting DRAM access, if required).

At page start the DIU interface determines which buffers should be filled with null data and which should
request DRAM access. New requests are issued until the dot line is completely read from DRAM.
For each request to the DRAM the address generator calculates where in the DRAM the dot data should be
read from. The color_enable bus determines which colors are enabled; the interface never issues DRAM
requests for disabled colors.
31.5.6.2 Interface controller
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Figure 232. Interface controller state diagram 

The interface controller co-ordinates and issues requests for data transfers from DRAM. The state machine
waits in the Idle state until it is enabled by the LLU controller (llu_en) and a request for data transfer is
received from the write pointer block.

When an active request is received (req_active equals 1) the state machine jumps to the ColorSelect state
to determine which colors (color_cnt) in the group need a data transfer. A group is defined as all odd
colors or all even colors. If the color isn't enabled (color_enable) the count just increments, and no data is
transferred. If the color is enabled, the state machine takes one of two options: either a null data transfer
or an actual data transfer from DRAM. A null data transfer writes zero data to the DIU buffer and does not
issue a request to DRAM.

The state machine determines if a null transfer is required by checking the color_start signal for that color.
If a null transfer is required the state machine doesn't need to issue a request to the DIU and so jumps
directly to the data transfer states (Data0 to Data3). The machine clocks through the 4 states, each time
writing a null 64-bit data word to the buffer. Once complete the state machine returns to the ColorSelect
state to determine if further transfers are required.

If the color_start is active then a data transfer is required. The state machine jumps to the Request state
and issues a request to the DIU controller for DRAM access by setting llu_diu_rreq high. The DIU
responds by acknowledging the request (diu_llu_rack equals 1) and then sending 4 64-bit words of data.
The transition from the Request to the Data0 state signals the address generator to update the address
pointer (adr_update). The state machine clocks through the Data0 to Data3 states, each time writing the
64-bit data into the buffer selected by the req_sel bus. Once complete the state machine returns to the
ColorSelect state to determine if further transfers are required.

When in the ColorSelect state and all data transfers for colors in that group have been serviced (i.e. when
color_cnt is 6) the state machine will return to the Idle state. On transition it will update the word counter
logic (word_dec) and enable the request logic (req_update).

A reset or llu_go_pulse set to 1 will cause the state machine to jump directly to Idle. The controller will
remain in the Idle state until it is enabled by the LLU controller via the llu_en signal. This prevents the DIU
attempting to fill the DIU buffers before the dot line store FIFO has filled over its threshold level.

31.5.6.3 Color activate 

The color activate logic maintains an absolute line count indicating the line number currently being
processed by the LLU. The counter is reset when llu_go_pulse is 1 and incremented each time a line_rd
pulse is received. The count value (line_cnt) is used to determine when to start reading data for a color.

The count is implemented as follows:
if (llu_go_pulse == 1) then
    line_cnt = 0
elsif (line_rd == 1) then
    line_cnt++

The color activate logic compares the line count with the relative line value to determine when the LLU
should start reading data from DRAM for a particular half color. It signals to the interface controller block
which colors are active for this dot line in a page (via the color_start bus). This is used by the interface
controller to determine which DIU buffers require null data.

Once the color_start bit for a color is set it cannot be cleared during normal page processing. The
bits must be reset by the CPU at the end of a page by transitioning the Go bit and causing a pulse on the
llu_go_pulse signal.

Any color not enabled by the color_enable bus will never have its color_start bit set.

for (i=0; i<12; i++) {
    if (llu_go_pulse == 1) then
        col_on[i] = 0
    elsif (color_enable[i % 6] == 0) then    // disabled colors are never activated
        col_on[i] = 0
    elsif (line_cnt == color_rel_line[i]) then
        col_on[i] = 1
}

// select either odd or even colors
if (odd_even_sel == 1) then    // odd selected
    color_start[5:0] = {col_on[11], col_on[9], col_on[7], col_on[5], col_on[3], col_on[1]}
else                           // even selected
    color_start[5:0] = {col_on[10], col_on[8], col_on[6], col_on[4], col_on[2], col_on[0]}
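
The latching behaviour of the color activate logic can be sketched in Python as below. The function names are hypothetical. The sketch assumes that a disabled color is held at zero (as the prose requires), that activation happens when line_cnt equals the relative line value, and that half color i maps to enable bit i % 6 as written in the pseudocode.

```python
def color_activate_step(col_on, line_cnt, color_enable, color_rel_line, llu_go_pulse):
    """Return the next col_on[0..11] list for one evaluation of the logic."""
    nxt = list(col_on)
    for i in range(12):
        if llu_go_pulse:
            nxt[i] = False
        elif not color_enable[i % 6]:          # disabled colors never start
            nxt[i] = False
        elif line_cnt == color_rel_line[i]:    # assumption: equality triggers the latch
            nxt[i] = True
        # otherwise keep the previous value (a set bit stays set)
    return nxt


def color_start(col_on, odd_even_sel):
    """Select the odd (1) or even (0) half colors; list index k is bit k of the bus."""
    first = 1 if odd_even_sel else 0
    return [col_on[i] for i in range(first, 12, 2)]
```

Because the bit is only cleared by llu_go_pulse or by a disabled color, a color that has started stays active for the rest of the page.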



31.5.6.4 Address generator

The address generator block maintains 12 pointers (color_adr[11:0]) to DRAM corresponding to the current
read address in the dot line store for each half color. When a DRAM transfer occurs the address pointer is
used first and then updated for the next transfer for the color. The pointer used is selected by the req_sel
bus, and the pointer update is initiated by the adr_update signal from the interface controller.

The pointer update and pointer initialization are dependent on the pointer position in a line and the line
position in the FIFO.

When a llu_go_pulse is received the pointers are each initialized to the corresponding base address for that
color (color_base_adr). For each word that is read from DRAM the pointer is incremented. If the word is
the last word in a line (last_wd equals 1) and the line is the last line in the fifo (fifo_end equals 1) then
the address pointer is re-initialized to the base address value. The pointer is incremented for all other words.

The address is calculated as follows:
// reset to base address
if (llu_go_pulse == 1) then
    color_adr[11:0] = color_base_adr[11:0][21:5]
elsif (adr_update == 1) then
    if (req_sel == NULL) then
        // do nothing
    elsif ((fifo_end == 1) AND (last_wd == 1)) then
        color_adr[req_sel] = color_base_adr[req_sel][21:5]
    else
        color_adr[req_sel]++    // normal increment

// select the address pointer
llu_diu_radr = color_adr[req_sel]



31.5.6.5 Line pointer 

The line pointer logic counts the number of dot data lines read from DRAM for each color. The counter
value is used to signal the fifo wrap point to the address generator logic. A separate counter is maintained
for each color.

The end of a line can be determined when the address is updated (adr_update equals 1) and the word
transferred is the last word of a line (last_wd equals 1). The line pointer that needs to be updated is
selected by the req_sel bus from the write pointer block. If the selected pointer is zero the counter is reset
to the corresponding color_fifo_size value, otherwise the counter is decremented.

If the llu_go_pulse signal is high the counters are reset to their corresponding color_fifo_size values. When
a counter is zero it sets the fifo_end bit to signal to the address generator that the fifo has wrapped (to
update the address pointer accordingly).

if (llu_go_pulse == 1) then
    line_pt[11:0] = color_fifo_size[11:0]
elsif ((adr_update == 1) AND (last_wd == 1)) then {
    if (line_pt[req_sel] == 0) then
        line_pt[req_sel] = color_fifo_size[req_sel]
    else
        line_pt[req_sel]--
}

// select the correct line pointer for comparison
fifo_end = (line_pt[req_sel] == 0)
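
The interaction between the address pointers and the line pointers can be illustrated with a combined Python model. This is a sketch only: the class and method names are hypothetical, base addresses are assumed to be already expressed in 256-bit word units, and each call to transfer() stands for one adr_update event.

```python
class AddressGenerator:
    """Illustrative model of the address generator plus line pointer logic."""

    def __init__(self, color_base_adr, color_fifo_size):
        self.base = list(color_base_adr)        # one base address per half color
        self.fifo_size = list(color_fifo_size)  # one FIFO size value per half color
        self.go()

    def go(self):
        """Behaviour on llu_go_pulse: reload both sets of pointers."""
        self.color_adr = list(self.base)
        self.line_pt = list(self.fifo_size)

    def transfer(self, req_sel, last_wd):
        """Return the address used for this transfer, then update the pointers."""
        adr = self.color_adr[req_sel]           # pointer is used first
        fifo_end = self.line_pt[req_sel] == 0
        if fifo_end and last_wd:
            self.color_adr[req_sel] = self.base[req_sel]   # wrap to the base address
        else:
            self.color_adr[req_sel] += 1                   # normal increment
        if last_wd:
            if self.line_pt[req_sel] == 0:
                self.line_pt[req_sel] = self.fifo_size[req_sel]
            else:
                self.line_pt[req_sel] -= 1
        return adr
```

With a base address of 100, a FIFO size value of 1 and two words per line, successive transfers for one half color use addresses 100, 101, 102, 103 and then wrap back to 100.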

31.5.6.6 Write pointer

The write pointer logic maintains the buffer write address pointers, determines when the DIU buffers need
a data transfer and signals when the DIU buffers are empty. The write pointer determines the address in the
DIU buffer that the data should be transferred to.

The write pointer logic compares the read and write pointers of each DIU buffer to determine which
buffers require data to be transferred from DRAM (pend[11:0] bus), and which buffers are empty (the
buf_emp signals). Only enabled buffers are considered, as indicated by the color_enable bus.

Buffers are grouped into odd and even buffers. If an odd buffer requires DRAM access the odd_pend signal
will be active; if an even buffer requires DRAM access the even_pend signal will be active. If both
odd and even buffers require DRAM access, the even buffers will get serviced first.

If any buffer requires a DRAM transfer, the logic will indicate this to the interface controller via the
req_active signal, with the odd_even_sel signal determining which group of buffers gets serviced. The
interface controller will check the color_enable signal and issue DRAM transfers for all enabled colors in
a group. When the transfers are complete it tells the write pointer logic to update the request pending
state via the req_update signal.

The req_sel[3:0] signal tells the address generator which buffer is being serviced; it is constructed from
the odd_even_sel signal and the color_cnt[2:0] bus from the interface controller. When data is being
transferred from DRAM the word pointer and write pointer for the corresponding buffer are updated. The
req_sel determines which pointer should be incremented.

The write pointer logic operates the same way regardless of whether the transfer is null or not.

// determine which buffers need updates
for (i=0; i<12; i++) {
    // determine if request is active, filtered by color enable
    if (wr_adr[i][3:2] == rd_adr[i][3:2]) then
        pend[i] = 1
    else
        pend[i] = 0
    // determine if any enabled buffer is empty
    if ((wr_adr[i][3:0] == rd_adr[i][3:0]) AND (color_enable[i / 2] == 1)) then
        buf_emp[i] = 1
}

// Odd half colors (1,3,5,7,9,11), even half colors (0,2,4,6,8,10)
odd_pend  = (pend[1] | pend[3] | pend[5] | pend[7] | pend[9] | pend[11])
even_pend = (pend[0] | pend[2] | pend[4] | pend[6] | pend[8] | pend[10])
// fixed servicing order, only update when controller dictates so
if (req_update == 1) then {
    if (even_pend == 1) then       // even always first
        odd_even_sel = 0
        req_active = 1
    elsif (odd_pend == 1) then     // then check odd
        odd_even_sel = 1
        req_active = 1
    else                           // nothing active
        odd_even_sel = 0
        req_active = 0
}

// selected requestor
req_sel[3:0] = {color_cnt[2:0], odd_even_sel}    // concatenation
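
The fixed even-before-odd servicing order and the construction of req_sel can be sketched in Python as follows. The function names are hypothetical, and the sketch takes odd_even_sel to be 1 when the odd group is selected, consistent with the "even = 0, odd = 1" convention used by the word count logic.

```python
def pending(wr_adr, rd_adr):
    """pend[i] is set when bits [3:2] of the write and read addresses match."""
    return [((w >> 2) & 0x3) == ((r >> 2) & 0x3) for w, r in zip(wr_adr, rd_adr)]


def select_group(pend):
    """Return (odd_even_sel, req_active) with even buffers always served first."""
    even_pend = any(pend[0::2])
    odd_pend = any(pend[1::2])
    if even_pend:
        return 0, True
    if odd_pend:
        return 1, True
    return 0, False


def req_sel(color_cnt, odd_even_sel):
    """Concatenate color_cnt[2:0] (upper bits) with odd_even_sel (bit 0)."""
    return ((color_cnt & 0x7) << 1) | (odd_even_sel & 0x1)
```

For instance, color_cnt = 2 with the odd group selected addresses half color 5.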

The write address pointer logic consists of 12 2-bit counters and a word select pointer. The counters are
reset when llu_go_pulse is one. The word pointer (word_ptr) is common to all buffers and is used to write
64-bit words into the DIU buffer. It is incremented when buf_rd_en is active. If the word_ptr is 3 and
buf_rd_en is active the selected write pointer (wr_ptr[req_sel]) will be incremented. A concatenation of
the write pointer and the word pointer is used to construct the buffer write address. The write pointers are
not reset at the end of each line.

// determine which pointer to update
if (buf_wr_en == 1) then {
    wr_adr[req_sel]++
    wr_en[req_sel] = 1
}

// determine which pointer to update
if (llu_go_pulse == 1) then
    wr_ptr[11:0] = 0
    word_ptr = 0
elsif (buf_rd_en == 1) then {
    word_ptr++
    if (word_ptr == 3) then
        wr_ptr[req_sel]++
}

// create the address from the write pointer and word pointer
wr_adr[req_sel] = {wr_ptr[req_sel], word_ptr}    // concatenation






31.5.6.7 Word count 



The word count logic maintains 2 counters to track the number of words transferred from DRAM per line:
one counter for odd data, and one counter for even. On receipt of a llu_go_pulse, the counters are
initialized to the color_line_inc value (number of words per line). When a group of words is transferred
from DRAM, as indicated by the word_dec signal from the interface controller, the corresponding counter is
decremented. The counter to decrement is indicated by the odd_even_sel signal from the write pointer
block (even = 0, odd = 1).

When a counter is zero the last_wd signal for that group (i.e. odd or even) is set. The last_wd signal
indicates to the address generator that the next word transferred from DRAM for the corresponding color is
the last word in the line. When the last word actually gets transferred the interface controller will pulse the
word_dec signal, causing the corresponding word count to reset to the color_line_inc value.

// determine which counter to decrement
if (llu_go_pulse == 1) then
    word_cnt[0] = color_line_inc    // even count
    word_cnt[1] = color_line_inc    // odd count
elsif (word_dec == 1) then {        // need to decrement one word counter
    if (word_cnt[odd_even_sel] == 0) then    // line finish
        word_cnt[odd_even_sel] = color_line_inc
    else
        word_cnt[odd_even_sel]--
}

// select the correct last_wd
last_wd = (word_cnt[odd_even_sel] == 0)

The word count logic also determines when a complete line has been read from DRAM. It then signals (via
the line_rd signal) that a complete line has been read.

// line finish logic
if (llu_go_pulse == 1) then
    line_fin = 0
    line_rd = 0
elsif ((last_wd == 1) AND (line_fin == 0) AND (word_dec == 1)) then
    line_fin = 1    // first group last_wd finish pulse
    line_rd = 0
elsif ((last_wd == 1) AND (line_fin == 1) AND (word_dec == 1)) then
    line_fin = 0    // second group last_wd finish pulse
    line_rd = 1
else
    line_fin = line_fin    // stay the same
    line_rd = 0
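
The way the two word counters combine into a single line_rd pulse can be traced with a short Python model. This is a sketch: the class and method names are hypothetical, each call to word_dec() represents one word_dec pulse, and index 0 is taken to be the even counter and index 1 the odd counter.

```python
class WordCount:
    """Illustrative model of the word count and line finish logic."""

    def __init__(self, color_line_inc):
        self.color_line_inc = color_line_inc
        self.go()

    def go(self):
        """Behaviour on llu_go_pulse."""
        self.word_cnt = [self.color_line_inc, self.color_line_inc]
        self.line_fin = False

    def word_dec(self, odd_even_sel):
        """Process one word_dec pulse and return the resulting line_rd value."""
        last_wd = self.word_cnt[odd_even_sel] == 0
        if last_wd:
            self.word_cnt[odd_even_sel] = self.color_line_inc   # line finish
        else:
            self.word_cnt[odd_even_sel] -= 1
        line_rd = False
        if last_wd and not self.line_fin:
            self.line_fin = True        # first group has finished its line
        elif last_wd and self.line_fin:
            self.line_fin = False       # second group has finished as well
            line_rd = True
        return line_rd
```

line_rd is only raised once both the odd and the even group have transferred their last word, which is the pulse the fill level and color activate logic rely on.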
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32 PrintHead Interface (PHI) 



32.1 Overview



The Printhead interface (PHI) accepts dot data from the LLU and transmits the dot data to the printhead
using the printhead interface mechanism. The PHI generates the control and timing signals necessary to
load and drive the bi-lithic printhead. The CPU determines the line update rate to the printhead and adjusts
the line sync frequency to produce the maximum print speed, accounting for the printhead IC's size ratio
and inherent latencies in the syncing system across multiple SoPECs.

The PHI also needs to consider the order in which dot data is loaded in the printhead. This is dependent on 
the construction of the printhead and the relative sizes of printhead ICs used to create the printhead. See 
Bi-lithic Printhead Reference document for a complete description of printhead types [10]. 

The printing process is a real-time process. Once the printing process has started, the next print line's
data must be transferred to the printhead before the next line sync pulse is received by the printhead.
Otherwise the printing process will terminate with a buffer underrun error.

The PHI can be configured to drive a single printhead IC with or without synchronization to other
SoPECs. For example the PHI could drive a single IC printhead (i.e. a printhead constructed with one IC
only), or a dual IC printhead with one SoPEC device driving each printhead IC.

The PHI interface provides a mechanism for the CPU to directly control the PHI interface pins, allowing 
the CPU to access the bi-lithic printhead to: 

• determine printhead temperature 

• test for and determine dead nozzles for each printhead IC 

• initialize each printhead IC 

• pre-heat each printhead IC 

Figure 233 shows a high level data flow diagram of the PHI in context. 



[Diagram labels: SoPEC (LLU, PHI, CPU), data and control paths, Bi-lithic Printhead]

Figure 233. High level data flow diagram of PHI in context
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32.2 Printhead modes of operation 

The printhead has 4 different modes of operation (although some modes are re-used). The mode of
operation is defined by the state of the output pins phi_lsyncl and phi_readl. As both printhead ICs are
driven by the same signals, both printhead ICs must be in the same mode of operation. The modes of
operation are defined in Table 158.



Table 158. Printhead modes of operation

Name                  Mode pins  Sub-mode select  Description
NORMAL                1, 1                        Normal print mode. Dot data is clocked into the printhead
                                                  shift register on each falling edge of phi_srclk.
DOT_LOAD / FIRE_INIT  1, 0       phi_srclk = 0    Dot load mode. Data stored in the dot shift register is
                                                  transferred into the dot latch on the falling edge of
                                                  phi_lsyncl, and latched in on the rising edge of phi_lsyncl.
                                 phi_srclk = 1    Fire load mode. Parameters for generating the fire pattern
                                                  are loaded into the generator. Data on phi_ph_data[1:0][0]
                                                  is clocked into the generator on each rising edge of
                                                  phi_frclk.
TEST_MODE             0, 0       phi_srclk = 0    Dot load mode. Data stored in the dot shift register is
                                                  transferred into the dot register on the rising edge of
                                                  phi_lsyncl. Identical to DOT_LOAD.
                                 phi_srclk = 1    The printhead is in test mode. The temperature delta sigma
                                                  is clocked out of the printhead on the rising edge of frclk
                                                  through phi_ph_data[1:0][1]. The result of the nozzle test
                                                  is clocked out of the printhead through phi_ph_data[1:0][0].
FIRE_GEN              0, 1       N/A              The nozzle test circuit is reset.
                                                  CMOS testing mode: the dot shift register is scanned out of
                                                  the printhead on the falling edge of phi_srclk. Data is
                                                  output on phi_ph_data[1:0][1:0].
                                                  The initialised generator creates the fire pattern and shift
                                                  select pattern, and the pattern is clocked into the fire
                                                  shift register and select shift register on the rising edge
                                                  of phi_frclk.



32.3 Data rate equalization



The LLU can generate dot data at the rate of 12 bits per cycle, where a cycle is at the system clock
frequency. In order to achieve the target print rate of 30 sheets per minute, the printhead needs to print a
line every 100 µs (calculated from 300 mm @ 65.2 dots/mm divided by 2 seconds, approximately 100 µs). For
a 7:3 constructed printhead this means that 9744 cycles at 106 MHz is quick enough to transfer the dot data.
The input FIFOs are used to de-couple the read and write clock domains as well as to provide for differences
between the consume and fill rates of the PHI and LLU.

Nominally the system clock (pclk) is run at 160 MHz and the printhead interface clock (phiclk) is at
106 MHz.
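
The line time budget quoted above can be reproduced with a few lines of Python. The function names and default arguments are illustrative only; the figures themselves (300 mm page, 65.2 dots/mm, 2 seconds per page, 9744 cycles, 106 MHz) are the ones given in the text.

```python
def line_time_us(page_length_mm=300.0, dots_per_mm=65.2, seconds_per_page=2.0):
    """Time available per print line, in microseconds."""
    lines_per_page = page_length_mm * dots_per_mm
    return seconds_per_page / lines_per_page * 1e6


def transfer_time_us(cycles=9744, phiclk_hz=106e6):
    """Time needed to clock one line into the longer printhead IC, in microseconds."""
    return cycles / phiclk_hz * 1e6
```

The available line time comes out at roughly 102 µs and the 7:3 transfer at roughly 92 µs, so the transfer fits within one line time.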

If the PHI were to transfer data at the full printhead interface rate, the transfer of data to the shorter
printhead IC would be completed sooner than the transfer to the longer printhead IC. While in itself this
isn't an issue, it requires that the LLU be able to supply data at the maximum rate for a short duration,
which means uneven, bursty access to DRAM and is undesirable. To smooth the LLU DRAM access
requirements over time the PHI transfers dot data to the printhead at a pre-programmed rate, proportional
to the ratio of the shorter to longer printhead ICs.



[Timing diagrams, without and with rate equalization (7:3 head), over a 100 usec line time. Signals shown:
phi_lsyncl, phi_ph_data[0][1:0], phi_ph_data[1][1:0], phi_srclk[0], phi_srclk[1]]

Figure 235. Printhead data rate equalization

The printhead data rate equalization is controlled by the PrintHeadRate[1:0] registers (one per printhead
IC). The register is a 16-bit bitmap of active clock cycles in a 16 clock cycle window. For example, if the
register is set to 0xFFFF then the output rate to the printhead will be full rate; if it is set to 0xF0F0 then
the output rate is 50%, with 4 active cycles followed by 4 inactive cycles and so on. If the register were
set to 0x0000 the rate would be 0%. The relative data transfer rate of the printhead can be varied from 0 to
100% with a granularity of 1/16 steps.
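
The relationship between a PrintHeadRate bitmap and the resulting transfer rate can be sketched in Python. The function names are hypothetical, and the assumption that the most significant bit corresponds to the first cycle of the window is made here only because it matches the 0xF0F0 example.

```python
def rate_percent(print_head_rate: int) -> float:
    """Percentage of active cycles in the 16-cycle window."""
    active = bin(print_head_rate & 0xFFFF).count("1")
    return active * 100 / 16


def active_pattern(print_head_rate: int) -> list:
    """Per-cycle activity over the window (assumption: MSB is the first cycle)."""
    return [bool((print_head_rate >> (15 - i)) & 1) for i in range(16)]
```

For example, 0x5551 has 7 bits set, giving 43.75%, which is the closest 1/16 step to the 3:7 ratio of a 7:3 printhead.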



Table 159. Example rate equalization values for common printheads

Printhead ratio  Longer printhead IC  Shorter printhead IC
8:2              0xFFFF (100%)        0x1111 (25%)
7:3              0xFFFF (100%)        0x5551 (43.7%)
6:4              0xFFFF (100%)        0xF1F2 (68.7%)
5:5              0xFFFF (100%)        0xFFFF (100%)



If both printhead ICs are the same size (e.g. a 5:5 printhead) it may be desirable to reduce the data rate to 
both printhead ICs, to reduce the read bandwidth from the DRAM. 
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32.4 Dot generate and transmit order 

Several printhead types and arrangements exist (see Section 35 Memjet Printhead). The PHI is capable of
driving all possible configurations, but for the purposes of simplicity only one arrangement (arrangement 0,
see Section 35 Memjet Printhead) is described in the following examples.



[Diagram: dot transmit order for a Type 0 printhead IC and a Type 1 printhead IC, with paper passing under
the printhead. M = midway point in dots, N = number of dots in a line. Odd and even rows are separated by
5 lines.]

Figure 236. Printhead structure and dot generate order

The structure of the printhead ICs dictates the dot transmit order to each printhead IC. The PHI accepts two
streams of dot data from the LLU, one even stream and one odd stream. The PHI constructs the dot transmit
order streams from the dot generate order received from the LLU. Each stream of data has already been
arranged in increasing or decreasing dot order sense by the DWU. The exact sense choice is dependent on
the type of printhead ICs used to construct the printhead, but regardless of configuration the odd and even
streams should be of opposing sense.

The dot transmit order is shown in Figure 236. Dot data is shifted into the printhead in the direction of the
arrow, so from the diagram (taking the type 0 printhead IC) even dot data is transferred in increasing order
to the mid point first (0, 2, 4, ..., m-6, m-4, m-2), then odd dot data is transferred in decreasing order
(m-1, m-3, m-5, ..., 5, 3, 1). For the type 1 printhead IC the order is reversed, with odd dots in increasing
order transmitted first, followed by even dot data in decreasing order. Note that for any given color the odd
and even dot data transferred to the printhead ICs are from different dot lines; in the example in the
diagram they are separated by 5 dot lines. Table 160 shows the transmit dot order for some common A4
printheads. Different type printheads may have the sense reversed and may have an odd before even
transmit order or vice versa.
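
The type 0 transmit order described above (even dots ascending to the mid point, then odd dots descending) can be generated with a small Python sketch. The function name is hypothetical and m is assumed to be even, as in the examples given.

```python
def type0_transmit_order(m: int) -> list:
    """Even dots 0, 2, ..., m-2 followed by odd dots m-1, m-3, ..., 1."""
    evens = list(range(0, m, 2))        # increasing order up to the mid point
    odds = list(range(m - 1, 0, -2))    # decreasing order back down to dot 1
    return evens + odds
```

With m = 12 this yields 0, 2, 4, 6, 8, 10, 11, 9, 7, 5, 3, 1. Other printhead types would swap which plane is sent first or reverse the sense of each plane.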



Table 160. Example printhead ICs, and dot data transmit order for A4 (13824 dots) page

Type 0 printhead IC

Size  Dots   Even dot order                      Odd dot order
8     11160  0, 2, 4, 6, ..., 5574, 5576, 5578   5579, 5577, 5575, ..., 7, 5, 3, 1
7     9744   0, 2, 4, 6, ..., 4866, 4868, 4870   4871, 4869, 4867, ..., 7, 5, 3, 1
6     8328   0, 2, 4, 6, ..., 4158, 4160, 4162   4163, 4161, 4159, ..., 7, 5, 3, 1
5     6912   0, 2, 4, 6, ..., 3450, 3452, 3454   3455, 3453, 3451, ..., 7, 5, 3, 1
4     5496   0, 2, 4, 6, ..., 2742, 2744, 2746   2747, 2745, 2743, ..., 7, 5, 3, 1
3     4080   0, 2, 4, 6, ..., 2034, 2036, 2038   2039, 2037, 2035, ..., 7, 5, 3, 1
2     2664   0, 2, 4, 6, ..., 1326, 1328, 1330   1331, 1329, 1327, ..., 7, 5, 3, 1

Type 1 printhead IC

Size  Dots   Odd dot order                                Even dot order
8     11160  13823, 13821, 13819, ...                     1332, 1334, 1336, ..., 13818, 13820, 13822
7     9744   13823, 13821, 13819, ...                     2040, 2042, 2044, ..., 13818, 13820, 13822
6     8328   13823, 13821, 13819, ...                     2748, 2750, 2752, ..., 13818, 13820, 13822
5     6912   13823, 13821, 13819, ..., 3461, 3459, 3457   3456, 3458, 3460, ..., 13818, 13820, 13822
4     5496   13823, 13821, 13819, ...                     4164, 4166, 4168, ..., 13818, 13820, 13822
3     4080   13823, 13821, 13819, ...                     4872, 4874, 4876, ..., 13818, 13820, 13822
2     2664   13823, 13821, 13819, ...                     5580, 5582, 5584, ..., 13818, 13820, 13822



32.4.1 Dual Printhead IC 

[Diagram: generate dot order (from the LLU) showing the odd and even dot streams over 6912 clock cycles, and
transmit dot order (to the printhead) for printhead channels A and B, switching at the mid point after
4872 and 2040 clock cycles respectively, for a total of 9744 clock cycles. Legend: even dots from line Y,
odd dots from line Y-5. Example: line with 13824 dots, with 7:3 printhead.]

Figure 237. Dot data generated and transmitted order



The LLU contains 2 dot generator units. Each dot generator reads dot data from DRAM and generates a 
stream of dots in increasing or decreasing order. A dot generator can be configured to produce odd or even 
dot data streams, and the dot sense is also configurable. In Figure 237 the odd dot generator is configured 
to produce odd dot data in decreasing order and the even dot generator produces dot data in increasing 
order. 

In order to reconstruct the dot data streams from the generate order to the transmit order, the connection
between the generators and transmitters needs to be switched at the mid point. At line start the odd dot
generator feeds the type 1 printhead, and the even dot generator feeds the type 0 printhead. This continues
until both printheads have received half the number of dots they require (defined as the mid point). The
mid point is calculated from the configured printhead size registers (PrintHeadSize). Once both printheads
have reached the mid point, the PHI switches the connections between the dot generators and the
printheads, so that the odd dot generator now feeds the type 0 printhead and the even dot generator feeds
the type 1 printhead. This continues until the end of the line.






It is possible that both printheads will not be the same size and as a result one dot generator may reach the
mid point before the other. In such cases the quicker dot generator is stalled until both dot generators reach
the mid point; the connections are then switched and both dot generators are restarted.

Note that in the example shown in Figure 237 the dot generators could generate an A4 line of data in 6912 
cycles, but because of the mismatch in the printhead IC sizes the transmit time takes 9744 cycles. 
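
The cycle counts quoted for Figure 237 follow from the mid point stall rule and can be checked with a short Python sketch. The function names are hypothetical, and the model assumes each printhead channel and each dot generator handles one dot per cycle.

```python
def generate_cycles(size_a: int, size_b: int) -> int:
    """Cycles for the two dot generators to produce a full line unthrottled."""
    return (size_a + size_b) // 2      # two streams, one dot each per cycle


def transmit_cycles(size_a: int, size_b: int) -> int:
    """Cycles to transmit a line when the quicker side stalls at the mid point."""
    half_a, half_b = size_a // 2, size_b // 2
    phase = max(half_a, half_b)        # each half lasts as long as the longer IC needs
    return 2 * phase                   # before the switch and after the switch
```

For a 7:3 printhead (9744 and 4080 dots) this gives 6912 generate cycles and 9744 transmit cycles. Setting one size to zero models a single printhead IC on one channel.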

32.4.2 Single printhead IC

In some cases only one printhead IC may be connected to the PHI. In Figure 238 the dot generate and
transmit order is shown for a single IC printhead of 9744 dots width. While the example shows the
printhead IC connected to channel A, either channel could be used. The LLU generates odd and even dot
streams as normal; it has no knowledge of the physical printhead configuration. The PHI is configured
with the printhead size (PrintHeadSize[1] register) for channel B set to zero and channel A set to 9744.



[Diagram: generate dot order (from the LLU) showing the odd and even dot streams over 4872 clock cycles, and
transmit dot order (to the printhead) for printhead channel A (two phases of 4872 clock cycles, 9744 clock
cycles in total) with printhead channel B unused. Legend: even dots from line Y, odd dots from line Y-5.
Example: line with 9744 dots, with 7:0 printhead.]

Figure 238. Dot data generated and transmitted order (single printhead case)

Note that in the example shown in Figure 238 the dot generators could generate a 7 inch line of data in
4872 cycles, but because the printhead is using one IC, the transmit time takes 9744 cycles, the same speed
as an A4 line with a 7:3 printhead.



32.4.3 Summary of generate and transmit order requirements 

In order to support all the possible printhead arrangements, the PHI (in conjunction with the LLU/DWU)
must be capable of re-ordering the bits according to the following criteria:

• Be able to output the even or odd plane first 

• Be able to output even and odd planes independently. 

• Be able to reverse the sequence in which the color planes of a single dot are output to the printhead. 
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32.5 Print sequence 

The PHI is responsible for accepting dot data streams from the LLU, restructuring the dot data sequence
and transferring the dot data to each printhead within a line time (i.e. before the next line sync).

Before a page can be printed the printhead ICs must be initialized. The exact initialization sequence is
configuration dependent, but will involve the fire pattern generation initialization and other optional steps. The
initialization sequence is implemented in software.

Once the first line of data has been transferred to the printhead, the PHI will interrupt the CPU by asserting
the phi_icu_print_rdy signal. The interrupt can be optionally masked in the ICU and the CPU can poll the
signal via the PCU or the ICU. The CPU must wait for a print ready signal in all printing SoPECs before
starting printing.

Once the CPU in the PrintMaster SoPEC is satisfied that printing should start, it triggers the LineSyncMaster
SoPEC by writing to the PrintStart register of all printing SoPECs. The transition of the PrintStart
register in the LineSyncMaster SoPEC will trigger the start of lsyncl pulse generation. The PrintMaster
and LineSyncMaster SoPEC are not necessarily the same device, but often are. For a more in-depth
definition see section 12.3 Multi-SoPEC systems on page 104.

Writing to the PrintStart register generates a pulse which is used to generate the line sync in the
LineSyncMaster, which is in turn used to align all SoPECs in a multi-SoPEC system. All printhead signaling is
aligned to the line sync. The PrintStart is only used to align the first line sync in a page.

When a SoPEC receives a line sync pulse it means that the line previously transferred to the printhead is
now printing, so the PHI can begin to transfer the next line of data to the printhead. When the transfer is 
complete the PHI will wait for the next line sync pulse before repeating the cycle. If a line sync arrives 
before a complete line is transferred to the printhead (i.e. a buffer error) the PHI generates a buffer under- 
run interrupt, and halts the block. 

For each line in a page the PHI must transfer a full line of data to the printhead before the next line sync is 
generated or received. 

32.5.1 Sync pulse control 

If the PHI is configured as the LineSyncMaster SoPEC it will start generating line sync signals LsyncPre
number of phiclk cycles after the PrintStart register rising transition is detected. All other signals in the PHI
interface are referenced from the falling edge of the phi_lsyncl signal.

If the SoPEC is in line sync slave mode it will receive a line sync pulse from the LineSyncMaster SoPEC
through the phi_lsyncl pin, which will be programmed into input mode. The phi_lsyncl input pin is treated
as an asynchronous input and is passed through a de-glitch circuit of programmable de-glitch duration
(LsyncDeglitchCnt).

The phi_lsyncl will remain low for LsyncLow cycles, and then high for LsyncHigh cycles. The phi_lsyncl
profile is repeated until the page is complete. The period of the phi_lsyncl is given by LsyncLow +
LsyncHigh cycles. Note that the LsyncPre value is only used to vary the time between the generation of the first
phi_lsyncl and the PageStart indication from the CPU. See Figure 239 for reference diagram.

If the SoPEC device is in line sync slave mode, the LsyncMinPeriod register specifies the minimum
allowed phi_lsyncl period. Any phi_lsyncl pulses received before the LsyncMinPeriod has expired will
trigger a buffer underrun error.

32.5.2 Shift register signal control 

Once the PHI receives the line sync pulse, the sequence of data transfer to the printhead begins. All PHI
control signals are specified from the falling edge of the line sync.

The phi_srclk (and consequently phi_ph_data) is controlled by the SrclkPre and SrclkPost registers. The
SrclkPre specifies the number of phiclk cycles to wait before beginning to transfer data to the printhead.
Once data transfer has started, the profile of the phi_srclk is controlled by the PrintHeadRate register and the
status of the PHI input FIFO. For example, it is possible that the input FIFO could empty, in which case no data
would be transferred to the printhead while the PHI was waiting. After all the data for a printhead is
transferred to the PHI, it counts SrclkPost number of phiclk cycles. If a new phi_lsyncl falling edge arrives
before the count is complete the PHI will generate a buffer underrun interrupt (phi_icu_underrun).



32.5.3 Firing sequence signal control 






The profile of the phi_frclk pulses per line is determined by 4 registers: FrclkPre, FrclkLow, FrclkHigh and
FrclkNum. The FrclkPre register specifies the number of cycles between the line sync falling edge and the
phi_frclk pulse high. It remains high for FrclkHigh cycles and then low for FrclkLow cycles. The number
of pulses generated per line is determined by the FrclkNum register.

The phi_profile pin is specified in a similar manner by the ProfilePre, ProfileLow, ProfileHigh and ProfileNum
registers.

The phi_frclk period and the phi_profile period should be programmed the same, so FrclkHigh + FrclkLow
should equal ProfileHigh + ProfileLow, and the number of cycles for each in a line time should also be
equal, i.e. FrclkNum = ProfileNum.

The total number of cycles required to complete a firing sequence should be less than the phi_lsyncl period,
i.e. ((ProfileHigh + ProfileLow) * ProfileNum) + ProfilePre < (LsyncLow + LsyncHigh).
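
The three programming rules above can be checked in software before the registers are written. The following is a minimal sketch, not part of the PHI design: the function name `check_fire_timing`, the dictionary layout and the example values are all hypothetical, and only the three conditions themselves come from the text.

```python
def check_fire_timing(regs):
    """Return a list of rule violations for a set of PHI timing values."""
    problems = []
    frclk_period = regs["FrclkHigh"] + regs["FrclkLow"]
    profile_period = regs["ProfileHigh"] + regs["ProfileLow"]
    if frclk_period != profile_period:
        problems.append("phi_frclk and phi_profile periods differ")
    if regs["FrclkNum"] != regs["ProfileNum"]:
        problems.append("FrclkNum and ProfileNum differ")
    # Total cycles needed by the firing sequence, measured in phiclk cycles.
    fire_cycles = profile_period * regs["ProfileNum"] + regs["ProfilePre"]
    if fire_cycles >= regs["LsyncLow"] + regs["LsyncHigh"]:
        problems.append("firing sequence does not fit in the lsyncl period")
    return problems


# Hypothetical values chosen only to exercise the checks.
example = {
    "FrclkHigh": 4, "FrclkLow": 6, "FrclkNum": 10,
    "ProfileHigh": 5, "ProfileLow": 5, "ProfileNum": 10, "ProfilePre": 20,
    "LsyncLow": 16, "LsyncHigh": 200,
}
print(check_fire_timing(example))
```

An empty list means the values satisfy all three rules; shortening the line sync period in the example would trigger the last check.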

[Figure 239 diagram: waveforms for phi_lsyncl, phi_srclk, phi_ph_data, phi_frclk and phi_profile, annotated
with the LsyncPre, LsyncPeriod, LsyncHigh, SrclkPre, SrclkPost, FrclkLow, ProfilePre, ProfileHigh and
ProfileLow parameters.]


Figure 239. Printhead interface timing parameters 



Figure 239 details the timing parameters controlling the PHI. All timing parameters are measured in num- 
ber of phiclk cycles. 



32.5.4 Page complete 

The PHI counts the number of lines processed through the interface. The line count is initialised to the
PageLenLine value and decrements each time a line is processed. When the line count reaches zero the PHI pulses the
phi_icu_page_finish signal. A pulse on phi_icu_page_finish automatically resets the PHI Go register,
and can optionally cause an interrupt to the CPU. Should the page terminate abnormally, i.e. on a buffer
underrun, the Go register will be reset and an interrupt generated.

32.5.5 Line sync interrupt 

The PHI will generate an interrupt to the CPU after a predefined number of line syncs have occurred. The
number of line syncs to count is configured by the LineSyncInterrupt register. The interrupt can be
disabled by setting the register to zero.



32.6 Dot line margin 

The PHI block allows the generation of margins either side of the received page from the LLU block. This 
allows the page width used within PEP blocks to differ from the physical printhead size. 

This allows SoPEC to store data for a page minus the margins, reducing the storage requirements in the
shared DRAM and the memory bandwidth requirements. The difference between the dot data line
size and the line length generated by the PHI is the dot line margin length. There are two margins specified
for any sheet, a margin per printhead IC side.

The margin value is set by programming the DotMargin register per printhead IC. It should be noted that 
the DotMargin register represents half the width of the actual margin (either left or right margin depending 
on paper flow direction). For example, if the margin in dots is 1 inch (1600 dots), then DotMargin should 
be set to 800. The reason for this is that the PHI only supports margin creation cases 1 and 3 described 
below.

See example in Figure 240.



[Figure 240 diagram: a sheet with a 200 dot margin and a 4772 dot print area, with the paper direction
marked, and the LLU data and phi_srclk activity relative to lsyncl for Case 1, Case 2 and Case 3. Labelled
widths: 9544 dots of LLU data and a 100 dot margin.]


Figure 240. Printhead timing with margining 



In the example the margin for the type 0 printhead IC is set at 100 dots (DotMargin=100), implying an
actual margin of 200 dots.

If case 1 is used the PHI takes a total of 9744 phi_srclk cycles to load the dot data into the type 0
printhead. It also requires 9744 dots of data from the LLU, which in turn gets read from the DRAM. In this case
the first 100 and last 100 dots would be zero but are processed through the SoPEC system, consuming
memory and DRAM bandwidth at each step.

In case 2 the LLU no longer generates the margin dots; the PHI generates the zeroed out dots for the
margining. The phi_srclk still needs to toggle 9744 times per line, although the LLU only needs to generate
9544 dots, giving the reduction in DRAM storage and associated bandwidth. The case 2 scenario is not
supported by the PHI because the same effect can be supported by means of case 1 and case 3.

If case 3 is used the benefits of case 2 are achieved, but the phi_srclk no longer needs to toggle the full
9744 clock cycles. The phi_srclk cycle count can be reduced by the margin amount (in this case
9744 - 100 = 9644 dots), and due to the reduction in phi_srclk cycles the phi_lsyncl period could also be reduced,
increasing the line processing rate and consequently increasing print speed. Case 3 works by shifting the
odd (or even) dots of a margin from line Y to become the even (or odd) dots of the margin Y-4 (Y-5
adjusted due to being printed one line later). This works for all lines with the exception of the first line,
where there has been no previous line to generate the zeroed out margin. This situation is handled by
adding the line reset sequence to the printhead initialization procedure, and is repeated between pages of a
document. See section 32.8.3 on page 512.
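
The three margin cases can be compared numerically. The sketch below simply restates the figures from the example (a 9744 dot printhead IC with DotMargin=100); the function name `margin_cases` and the returned dictionary layout are illustrative, not part of the design.

```python
def margin_cases(physical_dots, dot_margin):
    """Per-line LLU dots and phi_srclk cycles for margin cases 1, 2 and 3."""
    actual_margin = 2 * dot_margin   # DotMargin holds half the real margin
    return {
        # Case 1: margin dots are stored and processed like any other dot.
        1: {"llu_dots": physical_dots, "srclk_cycles": physical_dots},
        # Case 2 (not supported by the PHI): PHI inserts the zeroed dots.
        2: {"llu_dots": physical_dots - actual_margin,
            "srclk_cycles": physical_dots},
        # Case 3: shift count is also reduced, by the DotMargin value.
        3: {"llu_dots": physical_dots - actual_margin,
            "srclk_cycles": physical_dots - dot_margin},
    }


cases = margin_cases(9744, 100)
for number, counts in cases.items():
    print(number, counts)
```

For the example this gives 9544 LLU dots and 9644 phi_srclk cycles in case 3, matching the values quoted above.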

32.7 Dot counter 

The PHI keeps a dot usage count for each of the color planes (called AccumDotCount). If a
dot is used in a particular color plane the corresponding counter is incremented. Each counter is 32 bits wide
and saturates if not reset. A write to the DotCountSnap register causes the AccumDotCount[N] values to
be transferred to the DotCount[N] registers (where N is 5 to 0, one per color). The AccumDotCount
registers are cleared on value transfer.

The DotCount[N] registers can be written to or read from by the CPU at any time. On reset the counters 
are reset to zero. 

The dot counter only counts dots that are passed from the LLU through the PHI to the printhead. Any dots
generated by direct CPU control of the PHI pins will not be counted.
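
The snapshot behaviour described in this section can be modelled in a few lines. This is a behavioural sketch only, assuming one call per dot passed from the LLU to the printhead; the class name `DotCounter` and its method names are hypothetical, while the 32-bit saturation and the clear-on-snap behaviour come from the text.

```python
MAX_COUNT = 0xFFFF_FFFF  # counters are 32 bits wide and saturate


class DotCounter:
    """Behavioural model of AccumDotCount[N] and DotCount[N], N = 0..5."""

    def __init__(self, colors=6):
        self.accum_dot_count = [0] * colors
        self.dot_count = [0] * colors

    def count_dot(self, color_bits):
        """Account for one dot; color_bits holds a 0 or 1 per color plane."""
        for color, bit in enumerate(color_bits):
            if bit and self.accum_dot_count[color] < MAX_COUNT:
                self.accum_dot_count[color] += 1

    def snap(self):
        """Model a write to DotCountSnap: transfer, then clear the accumulators."""
        self.dot_count = list(self.accum_dot_count)
        self.accum_dot_count = [0] * len(self.accum_dot_count)


counter = DotCounter()
counter.count_dot([1, 0, 0, 0, 0, 1])
counter.count_dot([1, 1, 0, 0, 0, 0])
counter.snap()
print(counter.dot_count)
```

After the snapshot the accumulators restart from zero, so successive reads of DotCount give per-interval usage rather than a running total.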

32.8 CPU IO control

The PHI interface provides a mechanism for the CPU to directly control the PHI interface pins, allowing 
the CPU to access the bilithic printhead:

• Determine printhead temperature

• Test for and determine dead nozzles for each printhead IC 

• Printhead IC initialization 

• Printhead pre-heat function 

The CPU can gain direct control of the printhead interface connections by setting the PrintHeadCpuCtrl
register to one. Once enabled the printhead bits are driven directly by the PrintHeadCpuOut control
register, where the values in the register are reflected directly on the printhead pins, and the status of the
printhead input pins can be read directly from the PrintHeadCpuIn register. The direction of pins is controlled by
programming the PrintHeadCpuDir register. The register to pin mapping is as follows:



Table 161. CPU control and status registers mapping to printhead interface

Register name      Bits   Printhead pin
-----------------  -----  ------------------------------------------------
PrintHeadCpuOut    1:0    phi_ph_data_o[0][1:0]
                   3:2    phi_ph_data_o[1][1:0]
                   4      phi_lsyncl_o
                   5      phi_readl
                   7:6    phi_srclk[1:0]
                   8      phi_frclk
                   9      phi_profile
PrintHeadCpuDir    1:0    phi_ph_data_e[0][1:0] direction control
                          1 - output mode
                          0 - input mode
                   3:2    phi_ph_data_e[1][1:0] direction control
                          1 - output mode
                          0 - input mode
                   4      phi_lsyncl_e direction control
                          1 - output mode
                          0 - input mode
PrintHeadCpuIn     1:0    phi_ph_data_i[0][1:0]
                   3:2    phi_ph_data_i[1][1:0]
                   4      phi_lsyncl_i



Once in PrintHeadCpuCtrl mode it is the responsibility of the CPU to drive the
printhead correctly and not create situations where the printhead could be destroyed, such as activating all
nozzles together.
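
As an illustration of the bit assignments in Table 161, the helper below assembles a PrintHeadCpuOut value from individual pin levels. It is a sketch only: the function name `pack_cpu_out` and its keyword names are hypothetical, and the bit positions are taken from the table (data for channel A in bits 1:0, channel B in bits 3:2, phi_lsyncl_o in bit 4, phi_readl in bit 5, phi_srclk in bits 7:6, phi_frclk in bit 8 and phi_profile in bit 9).

```python
def pack_cpu_out(data_a=0, data_b=0, lsyncl=0, readl=0,
                 srclk=0, frclk=0, profile=0):
    """Pack pin levels into a 10-bit PrintHeadCpuOut register value."""
    # (value, bit offset, field width) following Table 161.
    fields = [
        (data_a, 0, 2), (data_b, 2, 2), (lsyncl, 4, 1), (readl, 5, 1),
        (srclk, 6, 2), (frclk, 8, 1), (profile, 9, 1),
    ]
    value = 0
    for field_value, offset, width in fields:
        if not 0 <= field_value < (1 << width):
            raise ValueError("field value does not fit in its bit field")
        value |= field_value << offset
    return value


print(hex(pack_cpu_out(profile=1)))          # only phi_profile driven high
print(hex(pack_cpu_out(data_a=3, readl=1)))  # channel A data and phi_readl
```

Range checking each field keeps a stray value from silently driving a neighbouring pin, which matters given the caution above about driving the printhead incorrectly.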

Note that the following procedures are based on current printhead capabilities, and are subject to change.

32.8.1 Dead nozzle information capture

The CPU (via the direct printhead control mechanism) has the capability of testing each of the nozzles in
the printhead and determining which nozzles are dead. The resultant dead nozzle information is processed
by the CPU to generate the dead nozzle table used by the DNC.

32.8.1.1 Nozzle test procedure

The nozzle test software must first initialize the fire pattern generator for each printhead IC as normal, then 
it must initialize the fire pattern register as normal. The fire pattern generator parameters must be chosen 
so as to create a fire pattern where only one nozzle is firing at a time. 

For example, consider a printhead constructed with a 7:3 configuration, where the left printhead is 7 inches
and the right 3 inches. The fire pattern length is equal to the number of dots in a half line (NLEN = n - 1,
where n = 9744 / 2 = 4872), with COUNT=1 and B=0. The fire generator in the printhead therefore needs to be
initialized with NLEN=4871, COUNT=1, B=0. See Section 32.8.4 for exact details on how to program the
fire pattern generator.

Once the generator is set up the nozzle test software puts the printhead into FIRE_GEN mode and the fire
pattern is loaded into the fire shift registers.

The next step is to load the dot data shift registers with a test pattern. Any test pattern could be used, but it
should be chosen so as to allow only one color to fire at a time. Once the printhead shift registers are
initialized the software can begin the nozzle test sequence.

The printhead is put in FIRE_GEN mode, which resets the test circuit; both phi_srclk and phi_frclk are held
inactive. After a pre-determined time the printhead is put in TEST_MODE where the nozzle is tested.
The test software toggles the phi_profile output pin and then samples the test result on the phi_ph_data pin.
The test software then generates one phi_frclk pulse to advance the fire pattern and repeats the profile
pulse and test result capture as before. This procedure is repeated for all dots in the half dot line. Once the
test result for a particular dot line is complete the whole procedure is repeated 12 times, once for each half
dot line.

The dead nozzle software collates all the nozzle test results and produces the dead nozzle table for use by
the DNC.
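
The test loop described above can be sketched as follows. This is an outline under stated assumptions rather than the actual test software: the `head` object and its methods (`set_mode`, `pulse_frclk`, `pulse_profile`, `sample_data`) are hypothetical stand-ins for writes to the direct CPU control registers, the wait between FIRE_GEN and TEST_MODE is omitted, and the sketch assumes the mode is re-entered once per half dot line.

```python
def run_nozzle_test(head, dots_per_half_line, half_lines=12):
    """Collect one test result per nozzle for each half dot line."""
    results = []
    for _ in range(half_lines):
        head.set_mode("FIRE_GEN")    # resets the test circuit
        head.set_mode("TEST_MODE")   # nozzle under test is now selected
        line = []
        for dot in range(dots_per_half_line):
            if dot:
                head.pulse_frclk()   # advance the fire pattern by one nozzle
            head.pulse_profile()     # toggle phi_profile
            line.append(head.sample_data())  # read result from phi_ph_data
        results.append(line)
    return results


class RecordingHead:
    """Hypothetical stand-in that logs pin activity and reports live nozzles."""

    def __init__(self):
        self.log = []

    def set_mode(self, mode):
        self.log.append(mode)

    def pulse_frclk(self):
        self.log.append("frclk")

    def pulse_profile(self):
        self.log.append("profile")

    def sample_data(self):
        self.log.append("sample")
        return 1


demo_head = RecordingHead()
demo_results = run_nozzle_test(demo_head, dots_per_half_line=3, half_lines=2)
print(demo_results)
```

For the 7:3 example the real loop would run with 4872 dots per half line; the small numbers here only make the recorded sequence easy to inspect.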



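[Figure 241 diagram: printhead mode sequence FIRE_INIT, FIRE_GEN, NORMAL, FIRE_GEN, TEST_MODE,
FIRE_GEN, TEST_MODE, shown against phi_lsyncl, phi_readl, phi_srclk, phi_profile and phi_ph_data[0].
The data line carries the fire init data, the test pattern data and then the nozzle test result, with the
test repeated once per nozzle.]
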

Figure 241. Nozzle Test Modes & Setup 



32.8.2 Temperature capture 



Occasionally the CPU will need to sample the printhead temperature and possibly adjust the firing profile
based on the result.

To capture the printhead temperature, the printhead must be put into TEST_MODE, and the
phi_ph_data_i pin into input mode. The CPU will toggle phi_frclk and then sample phi_ph_data_i to
capture the temperature data. The cycle is repeated N times, and the N bits of data are used to generate the
printhead temperature value. The temperature capture waveform is shown in Figure 242.

The exact number of bits required (i.e. N) and the temperature value generation mechanism are currently
undefined.



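[Figure 242 diagram: with phi_lsyncl and phi_readl selecting TEST_MODE, phi_frclk is pulsed for Clock 0,
Clock 1 through Clock N while phi_ph_data_i[1] returns Data 0, Data 1 through Data N, with invalid data
before and after; phi_srclk is also shown, along with an interval of N phiclk cycles.]
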

Figure 242. Temperature Capture Waveform

32.8.3 Printhead initialization procedure

In order to use the printhead for the first time the CPU must download parameters for controlling the fire
pattern generator. The download is performed by entering the FIRE_INIT mode; data is transferred
through the phi_ph_data[1:0][0] pins (one pin per printhead IC) and clocked into the printhead on the
rising edge of phi_frclk. In total 29 clock cycles are required to transfer the full set of parameters.



Table 162. Parameters for Fire Pattern Initialization

Parameter  Bits  Description
---------  ----  --------------------------------------------------------------
NLEN       14    Fire pattern length. Value defines the length of the fire
                 pattern, NLEN = N - 1 where N is the pattern length.
COUNT      14    Defines the remaining number of clock cycles required to
                 generate the Fire Pattern. Is given by
                 COUNT = (La / 2) mod N - 1, where La is the dot length of the
                 longer printhead, or
                 COUNT = (La - Lb - ((Lb / 2) mod N)) mod N - 1 for the shorter
                 printhead.
B          1     Select shift register inversion bit.



Once the generator is initialized the fire pattern and select pattern need to be created and shifted into their
respective shift registers. The printheads are put into FIRE_GEN mode and phi_frclk is toggled La
times, where La is the length of the longer printhead in dots. As phi_frclk is a common signal for both
printheads, if the printhead ICs are of different lengths one printhead IC will get clocked too
many times by phi_frclk. The fire pattern generator internal to each printhead IC takes account of this. See
Section 32.8.4 Fire pattern generator.

If dot line margining is to be used the dot data registers in the margining region in the printhead IC need to
be initialized to zero before any line is printed. See section 32.6 on page 507 for a full explanation of dot
line margin setup. The CPU does this by entering NORMAL_MODE and filling the dot data shift register
with zeros. This is performed by clocking phi_srclk to each printhead dot margin times for each
printhead IC. As phi_srclk is not common to both printhead ICs the number of clock cycles can differ
for each printhead IC.

Once the printhead initialization is complete control of the printhead can be released to the PHI to allow 
printing to begin. 

32.8.4 Fire pattern generator

The fire pattern generator is logic within each printhead IC used to generate the fire pattern and the select
shift pattern. The fire pattern generator must be initialized by the SoPEC device before a page can be
printed. The SoPEC uses the CPU direct IO control of the printhead pins to download the initialization
parameters and generate the initialization sequence.

32.9 Implementation

32.9.1 Definitions of I/O 



Table 163. Printhead interface I/O definition

Port name                 Pins  I/O  Description
------------------------  ----  ---  ------------------------------------------------------------
Clocks and Resets
pclk                      1     In   System Clock
phiclk                    1     In   Printhead interface clock (doclk/3) used to transfer data
                                     from pclk to doclk domains
doclk                     1     In   Data out clock (2x pclk) used to transfer data to printhead
prst_n                    1     In   System reset, synchronous active low. Synchronous to pclk
phirst_n                  1     In   System reset, synchronous active low. Synchronous to phiclk
dorst_n                   1     In   System reset, synchronous active low. Synchronous to doclk

General
phi_icu_print_rdy         1     Out  Indicates that the first line of data is transferred to the
                                     printhead. Active high.
phi_icu_page_finish       1     Out  Indicates that data for a complete page has transferred.
                                     Active high.
phi_icu_underrun          1     Out  Indicates the PHI has detected a buffer underrun. Active high.
phi_icu_linesync_int      1     Out  Indicates the PHI has detected LineSyncInterrupt number of
                                     line syncs.

Debug
debug_data_out[2:0]       3     In   Output debug data to be muxed on to the PHI pins
debug_cntrl[2:0]          3     In   Control signal for each PHI bound debug data line indicating
                                     whether or not the debug data should be selected by the pin
                                     mux

LLU Interface
llu_phi_data[1:0][5:0]    2x6   Out  Dot data from LLU to the PHI, each bit is a color plane
                                     5 downto 0.
                                     Bus 0 - Even dot data stream
                                     Bus 1 - Odd dot data stream
                                     Data is active when the corresponding bit is active in the
                                     llu_phi_avail bus.
phi_llu_ready[1:0]        2          Indicates that PHI is ready to accept data from the LLU
                                     0 - Even dot data stream
                                     1 - Odd dot data stream
llu_phi_avail[1:0]        2     Out  Indicates valid data present on corresponding llu_phi_data.
                                     0 - Even dot data stream
                                     1 - Odd dot data stream

Printhead interface
phi_ph_data_i[1:0][1:0]   2x2   In   Dot data input from printhead.
                                     Bus 0 - Printhead channel A
                                     Bus 1 - Printhead channel B
phi_ph_data_o[1:0][1:0]   2x2   Out  Dot data output to printhead. Each bus to each printhead
                                     contains 2 bits of data.
                                     Bus 0 - Printhead channel A
                                     Bus 1 - Printhead channel B
phi_ph_data_e[1:0][1:0]   2x2   Out  Dot data direction control. Pin is driving when high.
                                     Bus 0 - Printhead channel A
                                     Bus 1 - Printhead channel B
phi_srclk[1:0]            2     Out  Dot data shift clock used to clock in printhead data
                                     Bus 0 - Printhead channel A
                                     Bus 1 - Printhead channel B
phi_readl                 1     Out  Common printhead mode control. Used in conjunction with
                                     phi_lsyncl to determine the printhead mode
                                     0 - SoPEC receiving, printhead driving
                                     1 - SoPEC driving, printhead receiving
phi_frclk                 1     Out  Common fire pattern clock, needs to toggle once per fire
                                     cycle
phi_profile               1     Out  Common pulse profile for all colors
phi_lsyncl_o              1     Out  Capture dot data for next print line, output mode
phi_lsyncl_e              1     In   phi_lsyncl output enable, when high the phi_lsyncl pin is
                                     driving
phi_lsyncl_i              1     In   Line Sync Pulse from Master SoPEC

PCU Interface
pcu_phi_sel               1     In   Block select from the PCU. When pcu_phi_sel is high both
                                     pcu_adr and pcu_dataout are valid.
pcu_rwn                   1     In   Common read/not-write signal from the PCU.
pcu_adr[7:2]              6     In   PCU address bus. Only 6 bits are required to decode the
                                     address space for this block.
pcu_dataout[31:0]         32    In   Shared write data bus from the PCU.
phi_pcu_rdy               1     Out  Ready signal to the PCU. When phi_pcu_rdy is high it
                                     indicates the last cycle of the access. For a write cycle
                                     this means pcu_dataout has been registered by the block and
                                     for a read cycle this means the data on phi_pcu_data is
                                     valid.
phi_pcu_data[31:0]        32    Out  Read data bus to the PCU.



Doc: SoPEC_hardware_design
Version: 2.3
S3 Proprietary Document
29 Nov 2002
Page 514

SoPEC : Hardware Design



32.9.2 PHI sub-block partition 



[Figure 243 diagram: the Line Loader Unit (LLU) feeding the PHI sub-blocks, which are partitioned into the
pclk domain (160 MHz), the doclk domain (320 MHz) and the phiclk domain (106 MHz).]


Figure 243. PHI block partition 



32.9.3 Configuration registers 




The configuration registers in the PHI are programmed via the PCU interface. Refer to section 21.8.2 on
page 257 for a description of the protocol and timing diagrams for reading and writing registers in the PHI.
Note that since addresses in SoPEC are byte aligned and the PCU only supports 32-bit register reads and
writes, the lower 2 bits of the PCU address bus are not required to decode the address space for the PHI.
When reading a register that is less than 32 bits wide zeros should be returned on the upper unused bit(s)
of phi_pcu_data. Table 164 lists the configuration registers in the PHI.
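
The two addressing rules in this paragraph (word-aligned decode and zero-filled upper bits) can be illustrated with a small model. This is a sketch only: the helper names `word_index` and `read_value` are hypothetical, and the four register offsets shown are those listed at the start of Table 164.

```python
# Byte offsets of the first four PHI registers, as listed in Table 164.
REGISTER_OFFSETS = {
    "Reset": 0x00,
    "Go": 0x04,
    "PageLenLine": 0x08,
    "PrintStart": 0x0C,
}


def word_index(byte_offset):
    """Return the value seen on pcu_adr[7:2] for a register byte offset."""
    if byte_offset & 0x3:
        raise ValueError("PHI registers are 32-bit aligned")
    return (byte_offset >> 2) & 0x3F  # the lower 2 address bits are dropped


def read_value(stored, width):
    """Return a register read with the unused upper bits forced to zero."""
    return stored & ((1 << width) - 1)


for name, offset in REGISTER_OFFSETS.items():
    print(name, word_index(offset))
```

Dropping the two low address bits is what allows a 6-bit bus to cover the 256-byte register space of the block.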



Table 164. PHI registers description

Address     Register name       Bits  Reset        Description
----------  ------------------  ----  -----------  --------------------------------------------------
Control Registers
0x00        Reset               1     0x1          Active low synchronous reset, self de-activating.
                                                   A write to this register will cause a PHI block
                                                   reset.
0x04        Go                  1     0x0          Active high bit indicating the PHI is programmed
                                                   and ready to use. A low to high transition will
                                                   cause PHI block internal state to reset. Will be
                                                   automatically reset if a page finish or a buffer
                                                   underrun is detected.

General Control
0x08        PageLenLine         32    0x0000_0000  Specifies the number of dot lines in a page.
0x0C        PrintStart          1     0x0          A low to high transition triggers printing to
                                                   start. Only active in Master Mode.
0x10-0x14   DotMargin           2x16  0x0000       Specifies for each printhead IC, the width of the
                                                   margin in dots divided by 2.
                                                   0 - Printhead IC Channel A
                                                   1 - Printhead IC Channel B
0x18-0x2C   DotCount[5:0]       6x32  0x0000_0000  Indicates the number of dots used for a particular
                                                   color, where N specifies a color from 0 to 5.
                                                   Value valid after a write access to DotCountSnap.
0x30        DotCountSnap        1     0x0          Write access causes the AccumDotCount values to be
                                                   transferred to the DotCount registers. The
                                                   AccumDotCount are reset afterwards.
0x34        PhiHeadSwap         1     0x0          Controls which signals are connected to printhead
                                                   channels A and B.
                                                   0 - Normal, specifies bit 0 is channel A, bit 1 is
                                                       channel B
                                                   1 - Swapped, specifies bit 0 is channel B, bit 1
                                                       is channel A
0x38        PhiMode             1     0x0          Indicates whether the PHI is operating in master
                                                   or slave mode.
                                                   0 - Slave Mode
                                                   1 - Master Mode
0x3C-0x40   PhiSerialOrder      2x1   0x0          Specifies the serialization order of dots before
                                                   transfer to the printhead.
                                                   Bus 0 - Printhead Channel A
                                                   Bus 1 - Printhead Channel B
                                                   A 0 indicates order ABC, while 1 indicates CBA.
0x44-0x48   PrintHeadSize       2x16  0x0000       Specifies the number of non-margin dots in the
                                                   printhead ICs. If margining is to be used then the
                                                   configured PrintHeadSize should be adjusted by the
                                                   dot margin value, i.e. PrintHeadSize =
                                                   (PhysicalPrintHeadSize - (DotMargin * 2)).
                                                   Bus 0 - Specifies printhead on Channel A
                                                   Bus 1 - Specifies printhead on Channel B

CPU Direct PHI Control (See Table 161)
0x4C        PrintHeadCpuIn      5     0x00         PHI interface pins input status. Only active in
                                                   direct CPU mode.
0x50        PrintHeadCpuDir     5     0x00         PHI interface pins direction control. Only active
                                                   in direct CPU mode.
0x54        PrintHeadCpuOut     10    0x000        PHI interface pins output control. Only active in
                                                   direct CPU mode.
0x58        PrintHeadCpuCtrl    1     0x0          Controls direct CPU access to the PHI pins.
                                                   0 - Normal Mode
                                                   1 - Direct CPU Control mode

Line Sync Control
0x5C        LsyncLow            16    0x0000       Number of phiclk cycles phi_lsyncl should remain
                                                   low.
0x60        LsyncHigh           16    0x0000       Number of phiclk cycles phi_lsyncl should remain
                                                   high.
0x64        LsyncPre            16    0x0000       Number of phiclk cycles between PrintStart rising
                                                   transition and the generated phi_lsyncl falling
                                                   edge.
0x68        LsyncMinPeriod      24    0x00_0000    Minimum number of phiclk cycles between Lsync
                                                   pulses. Lsync pulses of a shorter period will be
                                                   rejected. Only used in slave mode.
0x6C        LsyncDeglitchCnt    4     0x3          Number of phiclk cycles to filter the incoming
                                                   Lsync pulse from the master. Only used in slave
                                                   mode.
0x70        LineSyncInterrupt   16    0x0000       Number of line syncs to occur before generating an
                                                   interrupt. When set to zero the interrupt is
                                                   disabled.

Shift Register Control
0x74        SrclkPre            14    0x0000       Number of phiclk cycles between phi_lsyncl falling
                                                   edge and phi_srclk pulse generation, or printhead
                                                   data transfer.
0x78        SrclkPost           14    0x0000       Number of phiclk cycles allowed margin from the
                                                   last srclk pulse in a line to before the next line
                                                   sync.
0x7C-0x80   PrintHeadRate[1:0]  2x16  0xFFFF       Specifies the active to inactive ratio of
                                                   phi_srclk for the printhead ICs. A 1 indicates
                                                   active.
                                                   Bus 0 - Printhead IC channel A
                                                   Bus 1 - Printhead IC channel B
0x84        DotOrderMode        1     0x0          Specifies the dot transmit order to the printhead
                                                   Channel A. Printhead Channel B is always the
                                                   opposing order.
                                                   0 - Even before Odd dots
                                                   1 - Odd before Even dots

Fire Control
0x88        ProfilePre          14    0x0000       Number of phiclk cycles between phi_lsyncl falling
                                                   edge and phi_profile pulse generation.
0x8C        ProfileLow          14    0x0000       Number of phiclk cycles phi_profile should remain
                                                   low.
0x90        ProfileHigh         14    0x0000       Number of phiclk cycles phi_profile should remain
                                                   high.
0x94        ProfileNum          16    0x0000       Number of profile pulses per line time.
0x98        FrclkPre            14    0x0000       Number of phiclk cycles between phi_lsyncl falling
                                                   edge and phi_frclk pulse generation.
0x9C        FrclkLow            14    0x0000       Number of phiclk cycles phi_frclk should remain
                                                   low.
0xA0        FrclkHigh           14    0x0000       Number of phiclk cycles phi_frclk should remain
                                                   high.
0xA4        FrclkNum            16    0x0000       Number of phi_frclk pulses per line time.

Working Registers
0xA8-0xAC   LineDotCnt          2x16  0x0000       Indicates the number of dots processed in the
                                                   current line.
                                                   Bus 0 - Printhead Channel A
                                                   Bus 1 - Printhead Channel B
                                                   (Read Only Registers)
0xB0        LineCnt             32    0x0000_0000  Indicates the number of lines processed in this
                                                   page. (Read Only Register)



The configuration registers in the PHI block are clocked at pclk rate, but several blocks in the PHI are
clocked by different and asynchronous clocks. Configuration values are not re-synchronized; it is therefore
important that the Go register be set to zero while updating configuration values. This prevents logic from
entering unknown states due to metastable clock domain transfers.

Some registers can be written to at any time, such as the direct CPU control registers (PrintHeadCpuIn,
PrintHeadCpuDir, PrintHeadCpuOut and PrintHeadCpuCtrl), the Go register and the PrintStart register.
All registers can be read from at any time.

When one of the direct CPU control registers is written to, the configuration registers block generates a 2
cycle pulse (cpu_io_wr) which is used to transfer the pin control signals from the pclk domain to the phiclk
domain. The cpu_io_wr signal is a delayed version of the write enable from the CPU.



32.9.4 Dot counter 

The dot counter keeps a running count of the number of dots fired for each color plane. The counters are
32 bits wide and will saturate. When the CPU wants to read the dot count for a particular color plane it
must write to the DotCountSnap register. This causes all 6 running counter values to be transferred to the
DotCount registers in the configuration registers block. The running counter values are then reset.

    // reset if being snapped
    if (dot_cnt_snap == 1) then {
        dot_count[5:0] = accum_dot_count[5:0]
        accum_dot_count[5:0] = 0
    }
    // update the counts
    for (color = 0; color < 6; color++) {
        if (accum_dot_count[color] != 0xffff_ffff) {
            // data valid, first dot stream
            data_valid = ((phi_llu_ready[0] == 1) AND (llu_phi_avail[0] == 1))
            if ((data_valid == 1) AND (llu_phi_data[0][color] == 1)) then
                accum_dot_count[color]++
            // data valid, second dot stream
            data_valid = ((phi_llu_ready[1] == 1) AND (llu_phi_avail[1] == 1))
            if ((data_valid == 1) AND (llu_phi_data[1][color] == 1)) then
                accum_dot_count[color]++
        }
    }




32.9.5 Sync generator 



The sync generator logic has two modes of operation, master and slave mode. In master mode (configured
by the PhiMode register) it generates the lsyncl_o output based on configured values and control triggers
from the PHI controller. In slave mode it de-glitches the incoming lsyncl_i signal, and filters the lsyncl
signal with the minimum configured period.



< 



Reset 



ayne #n««i AMP 
count « ltync_pro 



GOUDSsO 



^^ ^SyncPre ^ 



Machine remains in same state by detautt 
All outputs are zero unless otherwise stated 
State Descfiptlonr 
Reset Normal reset slate 

. SyncPra: Count the LsyncPre number of dock cydes 
SyncLow: Count the LsyndJMv rHjmber of dock 



tsynd.oa 1 



CQUnt-^AMOtasI in«>, 

count* ' 

HnajBt. 



11 



^SyncWait ^ 



count «lsyncJow 
lne_St-1 



iiUMTlod 



SiS*^ SyncLow ^ laync<.o«0 ^^^c^ri^^ 



SyncHiQh: Count the l.syncHigh numt>er of dock 
cydes 

SyndMiit Wait for an input Isync pulse 
SyncPertod: Count the LsyncMinperiod numt>er of dock 
cydes 



count ■ lsync.^h 



cxiunt *• l9yfic_/nin_pef1od 



bvne omae — lAWDcfl»mfi,o 



sync_erT -1 



lsynct.o-1 



lb Reset Seals 



Figure 244. Sync generator state diagram 

After reset or a pulse on phi_go_pulse the machine returns to the Reset state, regardless of what state it's
currently in.

The state machine waits until it's enabled (sync_en == 1) by the PHI controller state machine. When
enabled it can proceed to the SyncPre or SyncWait state depending on whether the state machine is configured
in master or slave mode. In master mode it generates the lsyncl pulses; in slave mode it receives and filters
the lsyncl pulses from the master sync generator.

On transition to the SyncPre state a counter is loaded with the LsyncPre value, and while in the SyncPre
state the counter is decremented. When the count is zero the machine proceeds to the SyncLow state, pulsing the
line_st signal on transition and loading the counter with the LsyncLow value. This indicates to the PHI
controller the line start aligned to the lsyncl negative edge.

The machine waits in the SyncLow state until the counter has decremented to zero. It then proceeds to the
SyncHigh state and counts LsyncHigh number of cycles. While in the SyncLow state the lsyncl_o output is set to
0 and in SyncHigh the lsyncl_o output is set to 1.

When the count is zero and the current line is not the last (last_line == 0), the machine returns to the
SyncLow state to begin generating a new line sync pulse. The transition pulses the line_st signal to the PHI
controller.

The loop is repeated until the current line is the last (last_line == 1), at which point the machine returns
to the Reset state to wait for the next page start.

In slave mode the state machine proceeds to the SyncWait state when enabled. It waits in this state until an
lsync_pulse is received from the input de-glitch circuit. When a pulse is detected the machine jumps to the
SyncPeriod state and begins counting down the LsyncMinPeriod number of clock cycles before returning
to the SyncWait state. On transition from the SyncWait to the SyncPeriod state the line_st signal to the PHI
controller is pulsed to indicate the line start. While in the SyncPeriod state, if an lsync_pulse is detected the
state machine will signal a sync error (via sync_err) to the PHI controller and cause a buffer underrun
interrupt.
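The master-mode sequence described above (SyncPre, then alternating SyncLow and SyncHigh once per line) can be illustrated with a small waveform generator. This is a sketch under stated assumptions: the function name `master_sync_waveform` is hypothetical, each state is assumed to last exactly the configured number of cycles, and the output level during SyncPre is assumed to be high (the text only specifies the levels for SyncLow and SyncHigh).

```python
def master_sync_waveform(lsync_pre, lsync_low, lsync_high, num_lines):
    """Return (levels, line_starts) for a page of num_lines lines.

    levels      -- per-cycle value of lsyncl_o
    line_starts -- cycle indices at which line_st is pulsed
    """
    levels = [1] * lsync_pre          # SyncPre: assumed idle-high level
    line_starts = []
    for _ in range(num_lines):
        line_starts.append(len(levels))   # line_st pulses on entry to SyncLow
        levels.extend([0] * lsync_low)    # SyncLow: lsyncl_o = 0
        levels.extend([1] * lsync_high)   # SyncHigh: lsyncl_o = 1
    return levels, line_starts


levels, starts = master_sync_waveform(lsync_pre=2, lsync_low=3, lsync_high=4, num_lines=2)
print(starts)
```

Each entry in `line_starts` coincides with a falling edge of the generated waveform, matching the statement that the line start is aligned to the lsyncl negative edge.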



32.9.5.1 Lsyncl input de-glitch



The lsync_i input is considered an asynchronous input to the PHI, and is passed through a synchronizer to
reduce the possibility of metastable states occurring before being passed to the de-glitch logic.

The input de-glitch logic rejects input states of duration less than the configured number of clock cycles
(lsync_deglitch_cnt). Input states of greater duration are reflected on the output, and are negative edge
detected to produce the lsync_pulse signal to the main generator state machine. The counter logic is given
by:

if (lsync_i != lsync_i_delay) then
    cnt = lsync_deglitch_cnt
    output_en = 0
elsif (cnt == 0) then
    cnt = cnt
    output_en = 1
else
    cnt--
    output_en = 0



[RTL diagram showing: lsync_i, synchronizer, counter logic (lsync_deglitch_cnt, cnt), compare, output_en, pulse generator, lsync_pulse]

Figure 245. Line sync de-glitch RTL diagram
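The counter logic above can be exercised with a cycle-by-cycle model. This is a hedged sketch: the function name `deglitch` is an assumption, the input is taken as already synchronized, and the filtered output is assumed to follow the input only while output_en is 1 and to hold its value otherwise.

```python
def deglitch(samples, deglitch_cnt):
    """Filter a list of 0/1 input samples.

    Returns (filtered, pulses): the filtered level per cycle and the cycle
    indices where a negative edge on the filtered level produces lsync_pulse.
    """
    prev_in = samples[0]   # lsync_i_delay
    out = samples[0]       # filtered output level
    cnt = 0
    filtered, pulses = [], []
    for cycle, sample in enumerate(samples):
        if sample != prev_in:          # input changed: restart the count
            cnt = deglitch_cnt
            output_en = False
        elif cnt == 0:                 # input stable long enough
            output_en = True
        else:
            cnt -= 1
            output_en = False
        prev_in = sample
        new_out = sample if output_en else out
        if out == 1 and new_out == 0:  # negative edge detect
            pulses.append(cycle)
        out = new_out
        filtered.append(out)
    return filtered, pulses
```

With a count of 2, a one-cycle low glitch is rejected while a sustained low level passes through after the count expires.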



32.9.5.2 Line sync interrupt logic

The line sync interrupt logic counts the number of line syncs that occur (either internally or externally
generated line syncs) and determines whether to generate an interrupt or not. The number of line syncs it
counts before an interrupt is generated is configured by the LineSyncInterrupt register. The interrupt is
disabled if LineSyncInterrupt is set to zero.

// implement the interrupt counter
if (phi_go_pulse == 1) then
    line_count = 0
elsif ((line_st == 1) AND (line_count == 0)) then
    line_count = linecount_int
elsif ((line_st == 1) AND (line_count != 0)) then
    line_count--
// determine when to pulse the interrupt
if (linesync_int == 0) then // interrupt disabled
    phi_icu_linesync_int = 0
elsif ((line_st == 1) AND (line_count == 1)) then
    phi_icu_linesync_int = 1
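To see how often the interrupt fires, the pseudocode can be traced in software. The sketch below is a literal reading, with assumptions: the function name `linesync_interrupts` is hypothetical, the reload value and the disable check are taken to be the same configured LineSyncInterrupt value, and both decisions are evaluated on the value of line_count held before the update (as registered logic would).

```python
def linesync_interrupts(num_line_syncs, linesync_int):
    """Return the 0-based indices of the line syncs that raise the interrupt."""
    line_count = 0          # state after phi_go_pulse
    fired = []
    for n in range(num_line_syncs):          # one iteration per line_st pulse
        irq = linesync_int != 0 and line_count == 1
        if line_count == 0:
            line_count = linesync_int        # reload
        else:
            line_count -= 1
        if irq:
            fired.append(n)
    return fired


print(linesync_interrupts(8, 3))
```

Tracing a few configurations this way is a quick check of the exact interrupt spacing before relying on it in firmware.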



32.9.6 Fire generator 



The fire generator block creates the signal profile for the phi_frclk and phi_profile signals to the printhead.
The profile is based on configured values and is timed in relation to the fire sync pulse from the PHI
controller block.



[State diagram with states: Reset, FirePre, FireHigh, FireLow]

Machine remains in same state by default.
All outputs are zero unless otherwise stated.
State Description:
Reset: Normal reset state
FirePre: Count the FrclkPre number of clock cycles, repeat count set to FrclkNum
FireHigh: Count the FrclkHigh number of clock cycles
FireLow: Count the FrclkLow number of clock cycles



Figure 246. Fire generator state diagram 



The fire generator consists of 2 identical state machines for creating the phi_frclk and phi_profile signals
respectively.

The machine is reset to the Reset state when phi_go_pulse == 1 or the reset is active, regardless of the
current state.

The machine waits in the Reset state until it receives a fire_st pulse from the PHI controller. The controller
will generate a fire_st pulse at the beginning of each dot line. On the state transition the cycle counter is
loaded with the FrclkPre value and the repeat counter is loaded with the FrclkNum value.

The state machine waits in the FirePre state until the cycle counter is zero, after which it jumps to the
FireHigh state and loads the cycle counter with the FrclkHigh value. Again the state machine waits until the
count is zero and then proceeds to the FireLow state. On transition the cycle counter is loaded with the
FrclkLow value. The state machine waits in the FireLow state while the cycle counter is decremented.

When the cycle counter reaches zero and the repeat_count is non-zero, the repeat_count is decremented,
the cycle counter is loaded with the FrclkHigh value and the state machine jumps to the FireHigh state to
repeat the phi_frclk generation cycle. The loop is repeated until the repeat_count is zero, at which point the
state machine goes to the Reset state and waits for the next fire_st pulse.

When in the Reset state the fire_rdy signal is active to indicate to the controller that the fire generator is
ready.
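The pre/high/low sequencing with a repeat counter can be sketched as a waveform generator. This is an illustrative model only: `fire_waveform` is a hypothetical name, each state is assumed to last exactly the configured number of cycles, and the repeat counter is read literally (checked after each FireLow phase, so a load value of N yields N + 1 pulses under this reading).

```python
def fire_waveform(frclk_pre, frclk_high, frclk_low, frclk_num):
    """Return the per-cycle output level for one fire_st trigger."""
    levels = [0] * frclk_pre            # FirePre: output stays low
    repeat_count = frclk_num
    while True:
        levels.extend([1] * frclk_high)  # FireHigh
        levels.extend([0] * frclk_low)   # FireLow
        if repeat_count == 0:            # loop ends, machine returns to Reset
            break
        repeat_count -= 1
    return levels


wave = fire_waveform(frclk_pre=1, frclk_high=2, frclk_low=1, frclk_num=2)
rising_edges = sum(1 for a, b in zip(wave, wave[1:]) if (a, b) == (0, 1))
print(len(wave), rising_edges)
```

The same function models either of the two identical machines; only the configured values differ between phi_frclk and phi_profile.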

32.9.7 PHI controller 

The PHI controller is responsible for controlling all functions of the PHI block on a line by line basis. It
controls and synchronizes the sync generator, the fire generator, and the datapath unit, as well as signalling
back to the CPU the PHI status. It also contains a line counter to determine when a full page has completed
printing.

[State diagram with states: Reset, FirstLine, Printstart, SyncWait, LineTrans, LastLine, Underrun]

Figure 247. PHI controller state machine

The PHI controller state machine is reset to the Reset state by a reset or phi_go_pulse == 1.

It will remain in reset until the block is enabled by phi_go == 1. Once enabled the state machine will proceed
to the FirstLine state, trigger the transfer of one line of data to the printhead (data_st == 1) and the line
counter will be initialized to the page length (PageLenLine). Once the line is transferred (data_fin from the
datapath unit) the machine will go to the Printstart state and signal the CPU using an interrupt that the PHI is
ready to begin printing (phi_icu_print_rdy). The line counter will also be decremented. It will then wait in
the Printstart state until the CPU acknowledges the print ready signal and enables printing by writing to
the PrintStart register.

The state machine proceeds to the SyncWait state and waits for a line start condition (line_st == 1). The line
start condition is different depending on whether the PHI is configured as being in a master or slave
SoPEC (the PhiMode register). In either case the sync generator determines the correct line start source
and signals the PHI controller via the line_st signal. Once received the machine proceeds to the LineTrans
state, with the transition triggering the fire generator to start (fire_st), the datapath unit to start (data_st)
and the sync generator to start (sync_st).

While in the LineTrans state the fire, sync and datapath units will be producing line data. When finished
processing a line the datapath unit will assert the line finished (line_fin) signal. If the line counter is not
equal to 1 (i.e. not the last line) the state machine will jump back to the SyncWait state and wait for the start
condition for the next line. The line counter will be decremented. If the line counter is one then the
machine will proceed to the LastLine state.

The LastLine state generates one more line of fire pulses to print the last line held in the shift registers of
the printhead. Once complete (fire_fin == 1) the state machine returns to the Reset state and waits for the
next page of data. On page completion the state machine generates a phi_icu_page_finish interrupt to signal
to the CPU that the page has completed. The phi_icu_page_finish will also cause the Go register to reset
automatically.

While the state machine is in the LineTrans state (or in the FirstLine state and the PHI is in slave mode) and
waiting for the datapath unit to complete line processing, it is possible (e.g. after an excessive PEP stall) that a
new line start condition occurs but the datapath unit is not ready. In this case an underrun error is generated.
The state machine goes to the Underrun state and generates a phi_icu_underrun interrupt to the
CPU. The PHI cannot recover from a buffer underrun error; the CPU must reset the PEP blocks and
restart printing. The phi_icu_underrun will also cause the Go register to reset automatically.
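The page-level sequencing above can be summarized as a transition function. This is a simplified sketch, not the RTL: the `State` enum and `step` function are illustrative names, interrupts and start pulses are omitted, the slave-mode underrun from FirstLine is not modelled, and the line counter is assumed to stay unchanged on the transition into LastLine.

```python
from enum import Enum


class State(Enum):
    RESET = "Reset"
    FIRST_LINE = "FirstLine"
    PRINT_START = "Printstart"
    SYNC_WAIT = "SyncWait"
    LINE_TRANS = "LineTrans"
    LAST_LINE = "LastLine"
    UNDERRUN = "Underrun"


def step(state, line_count, page_len_line, *, phi_go=False, data_fin=False,
         print_start=False, line_st=False, fire_fin=False):
    """Advance one clock and return (next_state, next_line_count)."""
    if state is State.RESET and phi_go:
        return State.FIRST_LINE, page_len_line       # load the page length
    if state is State.FIRST_LINE and data_fin:
        return State.PRINT_START, line_count - 1     # first line transferred
    if state is State.PRINT_START and print_start:
        return State.SYNC_WAIT, line_count           # CPU wrote PrintStart
    if state is State.SYNC_WAIT and line_st:
        return State.LINE_TRANS, line_count
    if state is State.LINE_TRANS:
        if data_fin:
            if line_count == 1:
                return State.LAST_LINE, line_count   # last line of the page
            return State.SYNC_WAIT, line_count - 1
        if line_st:
            return State.UNDERRUN, line_count        # datapath not ready
    if state is State.LAST_LINE and fire_fin:
        return State.RESET, line_count
    return state, line_count                         # hold state by default
```

Under these assumptions a page of three lines transfers one line in FirstLine and two more in LineTrans before reaching LastLine, and the Underrun state is only left by a reset.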



32.9.8 CPU IO control

The CPU IO control block is responsible for accepting CPU direct IO control signals from the configuration
registers (at pclk frequency) and transferring them to phiclk frequency. It also accepts the input signals
from the printhead and re-synchronizes them to the pclk domain, and accepts debug signals from the RDU and
muxes them to output pins.

Table 161 contains the direct mapping of configuration registers to printhead IO pins. Direct CPU control
is enabled only when PrintHeadCpuCtrl is set to one. In normal operation (i.e. PrintHeadCpuCtrl = 0)
the printhead data pins are always in output mode (phi_ph_data_e = 1), the phi_lsyncl will be an output if
the SoPEC is the master (i.e. phi_lsyncl_e = phi_mode), and readl will be set high.

The pseudocode for the CPU IO control is:

if (printhead_cpu_ctrl == 1) then // CPU access enabled
    // outputs
    phi_ph_data_o[0][1:0] = printhead_cpu_out[1:0]
    phi_ph_data_o[1][1:0] = printhead_cpu_out[3:2]
    phi_lsyncl_o = printhead_cpu_out[4]
    phi_readl = printhead_cpu_out[5]
    phi_srclk[1:0] = printhead_cpu_out[7:6]
    phi_frclk = printhead_cpu_out[8]
    phi_profile = printhead_cpu_out[9]
    // direction control
    phi_ph_data_e[0][1:0] = printhead_cpu_dir[1:0]
    phi_ph_data_e[1][1:0] = printhead_cpu_dir[3:2]
    phi_lsyncl_e = printhead_cpu_dir[4]
    // input assignments
    printhead_cpu_in[1:0] = synchronize(phi_ph_data_i[0][1:0])
    printhead_cpu_in[3:2] = synchronize(phi_ph_data_i[1][1:0])
    printhead_cpu_in[5] = synchronize(phi_lsyncl_i[0][1:0])
else // normal connections
    // outputs
    phi_ph_data_o[0][1:0] = ph_data[0][1:0]
    phi_ph_data_o[1][1:0] = ph_data[1][1:0]
    phi_lsyncl_o = lsync_o
    phi_readl = 1
    phi_srclk[1:0] = srclk[1:0]
    phi_frclk = frclk
    phi_profile = profile
    // direction control
    phi_ph_data_e[0][1:0] = 0x3
    phi_ph_data_e[1][1:0] = 0x3
    phi_lsyncl_e = phi_mode // depends on Master or Slave mode

// inputs
lsyncl_i = phi_lsync_i // connected regardless

// debug overrides any other connections
if (debug_cntrl[0] == 1) then
    phi_frclk = debug_data_out[0]
    phi_readl = pclk
if (debug_cntrl[1] == 1) then
    phi_profile = debug_data_out[1]
if (debug_cntrl[2] == 1) then
    phi_lsyncl_o = debug_data_out[2]
    phi_lsyncl_e = 1
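The bit layout of the direct CPU output register can be made concrete with a small unpacking helper. This is a sketch covering only the output assignments: the function name `cpu_io_outputs` and the dictionary representation are assumptions, and direction control, input synchronization and the debug overrides are left out.

```python
def cpu_io_outputs(cpu_ctrl, cpu_out, normal):
    """Return the output pin values as a dict.

    cpu_ctrl -- value of the CPU control enable (1 = direct CPU control)
    cpu_out  -- integer holding the 10-bit CPU output register
    normal   -- dict of pin values driven by the PHI logic in normal operation
    """
    if not cpu_ctrl:
        # normal connections: readl is always driven high
        return dict(normal, phi_readl=1)

    def bit(n):
        return (cpu_out >> n) & 1

    return {
        "phi_ph_data_o": [cpu_out & 0x3, (cpu_out >> 2) & 0x3],  # bits 1:0, 3:2
        "phi_lsyncl_o": bit(4),
        "phi_readl": bit(5),
        "phi_srclk": (cpu_out >> 6) & 0x3,                       # bits 7:6
        "phi_frclk": bit(8),
        "phi_profile": bit(9),
    }


print(cpu_io_outputs(1, 0x259, {}))
```

A helper like this is mainly useful in test benches or bring-up scripts, where a single register write has to be translated into expected pin levels.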

The debug signalling is controlled by the RDU block (see Section 11.8 Realtime Debug Unit (RDU)). The
IO control in the PHI muxes debug data onto the PHI pins based on the control signals from the RDU.

32.9.9 Datapath Unit

[Block diagram showing: Line Loader Unit (LLU), Datapath Unit, Dot Order A, Dot Order B. Clock domains marked: pclk domain (160 MHz), doclk domain (320 MHz), phiclk domain (106 MHz)]

Figure 248. Datapath Unit partition

32.9.10 Dot order controller

[State diagram with states: Reset, LineStart, LineMid, LineEnd]

Machine remains in same state by default.
All outputs are zero unless otherwise stated.

State Description:
Reset: Normal reset state
LineStart: Start processing first part of the line, wait for both mid_pt to be active
LineMid: Switch over wait state, allow pipeline to clear
LineEnd: Line end processing, wait for both line_fin to be active


Figure 249. Dot Order controller state diagram 



The dot order controller is responsible for controlling the dot order blocks. It monitors the status of each
block and determines the switch over point, at which the connections from odd and even dot streams to
printhead channels are swapped.

The machine is reset to the Reset state when phi_go_pulse == 1 or the reset is active. The machine will
wait until it receives a data_st pulse from the PHI controller before proceeding to the LineStart state. On
the transition to the LineStart state it will reset the dot counter in each dot order block via the dot_cnt_rst
signal.

While in the LineStart state both dot order blocks are enabled (gen_en = 1). The dot order blocks process
data until each of them reaches its mid point. The mid point of a line is defined by the configured printhead
size (i.e. print_head_size). When a dot order block reaches the mid point it immediately stops processing
and waits for the remaining dot order block. When both dot order blocks are at the mid point (mid_pt ==
11) the controller clocks through the LineMid state to allow the pipeline to empty and immediately goes to
the LineEnd state.

In the LineEnd state the mode_sel is switched and the dot order blocks are re-enabled. In this state the dot
order blocks read data from the opposite LLU dot data stream to the one used in the LineStart state. The
controller remains in the LineEnd state until both dot order blocks have processed a line, i.e. line_fin == 11.

On completion of both blocks the controller returns to the Reset state and again awaits the next data_st
pulse from the PHI controller. When in the Reset state the machine signals the PHI controller that it's ready to
begin processing dot data via the dot_order_rdy signal.

The dot order controller selects which dot streams should feed which printhead channels. The order can be
changed by configuring the DotOrderMode register. In all cases Channel A and Channel B must be in
opposing dot order modes. Table 165 shows the possible modes of operation.



Table 165. Mode selection in Dot order controller
[The headings of the two bit-value columns are not legible in the source.]

Channel                Description
A        0      0      Even before Odd (EBO mode), even dot stream feeds Channel A printhead, first half line.
A        0      1      Odd before Even (OBE mode), odd dot stream feeds Channel A printhead, first half line.
A        1      0      Even before Odd (EBO mode), even dot stream feeds Channel A printhead, second half line.
A        1      1      Odd before Even (OBE mode), odd dot stream feeds Channel A printhead, second half line.
B        0      0      Odd before Even (OBE mode), odd dot stream feeds Channel B printhead, second half line.
B        0      1      Even before Odd (EBO mode), even dot stream feeds Channel B printhead, second half line.
B        1      0      Odd before Even (OBE mode), odd dot stream feeds Channel B printhead, first half line.
B        1      1      Even before Odd (EBO mode), even dot stream feeds Channel B printhead, first half line.



32.9.10.1 Dot order unit

The dot order control accepts dot data from either dot stream from the LLU and writes the dot data into the
dot buffer. It has two modes of operation, odd before even (OBE) and even before odd (EBO). In the OBE
mode data from the odd dot stream is accepted first, then even; in EBO mode it's vice versa. The mode
is configurable by the DotOrderMode register.

The dot order unit maintains a dot count that is decremented each time a new dot is received from the
LLU. The dot order controller resets the dot counter to the print_head_size[15:0] at the start of a new line
via the dot_cnt_rst signal. The dot count is compared with the printhead size (print_head_size[15:0]
divided by 2) to determine the mid point (mid_pt), and the line finish point (line_fin) is reached when the
dot counter is zero.

The mid point is defined as half the number of dots in a particular printhead, and is given by the
print_head_size bus.

// define the mid point
if (dot_cnt[15:0] == print_head_size[15:1]) then
    mid_pt = 1
else
    mid_pt = 0

The dot order unit logic maintains the dot data write pointer. Each time a new dot is written to the dot
buffer the write pointer is incremented. The fill level of the dot buffer is determined by comparing the read
and write pointers. The fill level is used to determine when to backpressure the LLU (ready signal) due to
the dot buffer filling. A suitable threshold value is determined to allow for the full LLU pipeline to empty
into the dot buffer.

The dot order stalling control is given by:

// determine the ready/avail signal to use, based on mode select
if (mode_sel == 1) then
    dot_active = llu_phi_avail[0] AND ready
    wr_data = llu_phi_data[0]
else
    dot_active = llu_phi_avail[1] AND ready
    wr_data = llu_phi_data[1]
// update the counters
if (dot_active == 1) then {
    wr_en = 1
    wr_adr++
    if (dot_cnt == 0) then
        dot_cnt = print_head_size
    else
        dot_cnt--
}
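The dot counter comparison and the fill-level backpressure described in this section can be checked with two small helpers. These are sketches under assumptions: the names `line_markers` and `llu_ready` are hypothetical, the dot buffer is assumed to hold 32 entries (suggested by the constant 32 in the source's stall logic), and the read and write pointers are assumed to be 6-bit free-running counters so that their difference can be taken modulo 64.

```python
BUFFER_DEPTH = 32          # assumed dot buffer depth
POINTER_MODULUS = 64       # assumed 6-bit read/write pointers


def line_markers(print_head_size):
    """Return (mid, fin): how many dots have been received when the dot
    count first equals half the printhead size, and when it reaches zero."""
    dot_cnt = print_head_size              # value loaded by dot_cnt_rst
    mid = fin = None
    for received in range(1, print_head_size + 1):
        dot_cnt -= 1
        if mid is None and dot_cnt == print_head_size >> 1:
            mid = received
        if dot_cnt == 0:
            fin = received
    return mid, fin


def llu_ready(wr_adr, rd_adr, gen_en, threshold):
    """Return 1 when the LLU may send another dot, else 0."""
    fill_level = (wr_adr - rd_adr) % POINTER_MODULUS
    if fill_level > BUFFER_DEPTH - threshold:
        return 0                           # buffer is close to full
    if gen_en == 0:
        return 0                           # stalled by the controller
    return 1
```

The threshold leaves headroom for dots already in flight in the LLU pipeline, which is why the stall is raised before the buffer is completely full.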

The dot writer needs to determine when to stall the LLU dot data stream. A number of factors could stall
the dot stream in the LLU, such as the buffer filling, waiting for the mid point, waiting for the line finish, or
the dot order controller waiting for the line start condition from the PHI controller.
The stall logic is given by:

// determine when to stall the LLU generator
fill_level = wr_adr - rd_adr
if (fill_level > (32 - THRESHOLD)) then // THRESHOLD is open value TBD
    ready = 0 // buffer is close to full
elsif (gen_en == 0) then
    ready = 0 // stalled by the datapath controller
else
    ready = 1 // everything good no stall

[State diagram with states: Reset, SrclkPre, DataGen, MarginGen, SrclkPost]

Machine remains in same state by default.
All outputs are zero unless otherwise stated.
State Description:
Reset: Normal reset state
SrclkPre: Count the SrclkPre number of clock cycles
DataGen: Read line dot data from buffer
MarginGen: Generate DotMargin number of dots
SrclkPost: Wait for SrclkPost number of cycles

Figure 250. Data generator state diagram

32.9.10.2 Data generator

The data generator block reads data from the dot buffer and feeds dot data to the printhead at a configured
rate (set by the PrintHeadRate register). It also generates the margin zero data and aligns the dot data
generation to the synchronization pulse from the PHI controller.

The data generator controller waits in the Reset state until it receives a line start pulse from the PHI controller
(data_st signal). Once a start pulse is received it proceeds to the SrclkPre state, loading a counter with the
SrclkPre value. While in this state it decrements the counter. No data is read or output at this stage. When
the count is zero the machine proceeds to the DataGen state.

On transition it loads the counter with the printhead size (print_head_size). If margining is to be used then
the configured print_head_size should be adjusted by the dot margin value, i.e. print_head_size =
(physical_print_head_size - (dot_margin * 2)).

While in the DataGen state data is read from the dot buffer and output to the printhead. The counter will
decrement for every dot data word transferred. The exact rate is dictated by the dot buffer fill levels and the
configured printhead rate (PrintHeadRate).

The generator determines the rate by incrementing a rate counter (rate_cnt) while in the DataGen state.
The rate counter is allowed to wrap normally. If the bit selected by the rate_cnt in the print_head_rate bus
is one, data is transferred; otherwise the cycle is skipped. If the PrintHeadRate is set to all zeros then no
data will ever get transferred. The pseudocode for the DataGen state is given by:

// increment the rate count
rate_cnt++
// determine if data should be read
// first determine if data is available in buffer
if (rd_adr != wr_adr) then
    if (print_head_rate[rate_cnt] == 1) then
        dot_active = 1
        gate_srclk = 1
        rd_adr++
        dot_data = rd_data
        count--
    else
        dot_active = 0
        gate_srclk = 0
else
    dot_active = 0
    gate_srclk = 0

When the counter reaches zero the state machine will jump to the MarginGen state if the configured margin
value is non-zero, otherwise it will jump directly to the SrclkPost state. On transition to the MarginGen
state it loads the cycle counter with the dot_margin value, and begins to count down. While in the MarginGen
state the data generator logic block writes dot data to the printhead but does not read from the dot
buffers. It creates zero dot data words for the margin duration.

When the counter reaches zero the machine jumps to the SrclkPost state, loads the clock counter with the
SrclkPost value and decrements. When the count is finished the state machine returns to the Reset state and
awaits the next start pulse. Should a line sync arrive before the data generators have completed (data_fin
signal) the PHI controller will detect a print error and stall the PHI interface.
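The rate mechanism in the DataGen state amounts to using a wrapping counter as an index into a bit mask. The sketch below illustrates this under assumptions: `transfer_cycles` is a hypothetical name, the width of the rate bus is passed in as `rate_bits` because the source does not state it here, and the dot buffer is assumed never to run empty.

```python
def transfer_cycles(print_head_rate, rate_bits, num_cycles):
    """Return the cycle indices on which a dot data word is transferred.

    print_head_rate -- integer bit mask; bit n enables transfer when rate_cnt == n
    rate_bits       -- assumed width of the rate bus (rate_cnt wraps at this value)
    """
    rate_cnt = 0
    cycles = []
    for cycle in range(num_cycles):
        rate_cnt = (rate_cnt + 1) % rate_bits      # increment, wrapping normally
        if (print_head_rate >> rate_cnt) & 1:
            cycles.append(cycle)                   # gate_srclk = 1 on this cycle
    return cycles


print(transfer_cycles(0b0101, rate_bits=4, num_cycles=8))
```

The fraction of set bits in the mask gives the average transfer rate relative to the clock, and an all-zero mask never transfers, as the text notes.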



32.9.10.3 Data serializer



The data serializer block converts 6-bit dot data at phiclk rates (nominally 106 MHz) to 2-bit data at doclk
rates (nominally 320 MHz).



[Timing diagram showing: phiclk, doclk, dot_data[5:0] (Invalid, Valid[5:0], Valid[5:0], Invalid), gate_srclk, gate_srclk_del, srclk]

Figure 251. Data serializer timing 



The srclk is only active when data is available for transfer to the printhead, as enabled by the gate_srclk
signal. The data rate mechanism in the data generator block means that data is not transferred to the
printhead on every phiclk cycle. Both the dot_data and gate_srclk signals are clocked out by the phiclk and
can only change on the rising edge of phiclk.

The data serializer block allows easy separation of clock gating and clock to logic structures from the rest
of the PHI interface. All registers in the block are clocked at doclk rates.



[RTL diagram showing: inputs dot_data[0][5:0], dot_data[1][5:0], phiclk, phi_serial_order, gate_srclk[0], gate_srclk[1] and doclk; mux logic and a mux selecting between dot_data[1:0], dot_data[3:2] and dot_data[5:4]; gate_srclk_del; outputs ph_data[1:0] and srclk]

Figure 252. Data serializer RTL Diagram 
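The 6-bit to 2-bit conversion performed by the serializer, with the bit-pair ordering selected by the PhiSerialOrder register described in this section, can be sketched as follows. The function name `serialize` is an assumption, and the model ignores clocking and gating entirely.

```python
def serialize(dot, serial_order):
    """Split a 6-bit dot word into three 2-bit words in output order.

    serial_order == 0: dot[1:0], then dot[3:2], then dot[5:4]
    serial_order == 1: dot[5:4], then dot[3:2], then dot[1:0]
    """
    pairs = [dot & 0x3, (dot >> 2) & 0x3, (dot >> 4) & 0x3]
    return pairs if serial_order == 0 else pairs[::-1]


print(serialize(0b110110, 0), serialize(0b110110, 1))
```

Three 2-bit words per 6-bit input word is consistent with the nominal clock ratio between doclk (320 MHz) and phiclk (106 MHz).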

The mux logic determines which data bits from the dot_data bus should be selected for output on the
ph_data to the printhead. The selection is dependent on the phiclk edge.

if (phiclk == 1) then
    mux_sel = 1
elsif (mux_sel == 2) then
    mux_sel = 0
else

The dot data serialization order can be configured by the PhiSerialOrder register. If the PhiSerialOrder is zero
the order is dot[1:0], then dot[3:2], then dot[5:4]. If the register is one then the order is dot[5:4], dot[3:2],
dot[1:0].

Package and Test

33 Test Units



33.1 JTAG INTERFACE 

A standard JTAG (Joint Test Action Group) Interface is included in SoPEC for Bonding and
purposes. The JTAG port will provide access to all internal BIST (Built In Self Test) structures.

33.2 Scan Test I/O

The SoPEC device will require several test IOs for running scan tests. In general scan in and sci
will be multiplexed with functional pins.

33.3 Analog Test Units

33.3.1 USB PHY Testing

The USB PHY analog macro will contain a built-in test structure, which can be accessed by either the CPU
or through the JTAG port.

33.3.2 Embedded PLL Testing

The embedded clock generator PLL will require test access from the JTAG port.

34 SoPEC Pinning and Package

34.1 Overview

It is intended that the SoPEC package be a 100 pin LQFP. Any spare pins in the package may be used by
increasing the number of available GPIO pins or adding extra power and ground pins. The pin list shows the
minimum pin requirement for the SoPEC device.



Table 166. SoPEC Pin List

Pin name         Pins  I/O  Pad type         Voltage  Internal signal        Description

Clocks and resets
xtalin           1     I    TBD              N/A      xtalin                 Crystal input pin
xtalout          1     O    TBD              N/A      xtalout                Crystal output pin
reset_n          1     I    LVTTL            2.5V     reset_n                Asynchronous active low reset

Printhead Interface
ph_data[0][0]    2     O    LVDS             3.3V     phi_ph_data_o[0][0]    Dot data for colors 0-2 for Printhead 0. Using differential signalling
                       I    LVTTL            3.3V     phi_ph_data_i[0][0]    Input mode bit used for nozzle test result printhead 0
ph_data[0][1]    2     O    LVDS             3.3V     phi_ph_data_o[0][1]    Dot data for colors 3-5 for Printhead 0. Using differential signalling
                       I    LVTTL            3.3V     phi_ph_data_i[0][1]    Input mode bit used for temperature data printhead 0
ph_data[1][0]    2     O    LVDS             3.3V     phi_ph_data_o[1][0]    Dot data for colors 0-2 for Printhead 1. Using differential signalling
                       I    LVTTL            3.3V     phi_ph_data_i[1][0]    Input mode bit used for nozzle test result printhead 1
ph_data[1][1]    2     O    LVDS             3.3V     phi_ph_data_o[1][1]    Dot data for colors 3-5 for Printhead 1. Using differential signalling
                       I    LVTTL            3.3V     phi_ph_data_i[1][1]    Input mode bit used for temperature data printhead 1
srclk[0]         2     O    LVDS             3.3V     phi_srclk[0]           Differential dot data shift clock for printhead 0
srclk[1]         2     O    LVDS             3.3V     phi_srclk[1]           Differential dot data shift clock for printhead 1
readl            1     O    LVTTL            3.3V     phi_readl              Common printhead mode control
frclk            1     O    LVTTL            3.3V     phi_frclk              Common fire pattern shift clock, needs to toggle once per fire cycle
profile          1     O    LVTTL            3.3V     phi_profile            Common pulse profile for all colors
lsyncl           1     O    LVTTL            3.3V     phi_lsyncl_o           Line sync output from Master to Slaves
                       I    LVTTL            3.3V     phi_lsyncl_i           Line sync input to Slaves from Master

USB Connections
usbd             2     I/O  Differential     3.3V     Direct PHY connection  USB differential data

JTAG
tdo              1     O    CMOS             2.5V     tdo                    JTAG test data out port
tms              1     I    CMOS             2.5V     tms                    JTAG test mode select
tdi              1     I    CMOS             2.5V     tdi                    JTAG test data in port
tck              1     I    CMOS             2.5V     tck                    JTAG test access port clock

General Purpose IO
gpio[3:0]        4     O    CMOS             2.5V     gpio_o[3:0]            Motor control pins / general purpose output
                       I    CMOS             2.5V     gpio_i[3:0]            General purpose input
gpio[7:4]        4     O    High Drive CMOS  2.5V     gpio_o[7:4]            LED driver pins / general purpose output
                       I    CMOS             2.5V     gpio_i[7:4]            General purpose input
gpio[11:8]       4     O    Open collector   2.5V     gpio_o[11:8]           LSS interface pins / general purpose output
                       I    CMOS             2.5V     gpio_i[11:8]           LSS interface pins / general purpose input
gpio[13:12]      2     O    CMOS             2.5V     gpio_o[13:12]          ISI interface pins / general purpose output
                       I    CMOS             2.5V     gpio_i[13:12]          ISI interface pins / general purpose input

Test Pins
test_enable      1          CMOS             2.5V     TBD                    Test Enable
generic_test     5     I/O  CMOS             2.5V     TBD                    Generic test pin, function undefined

Total Signal Pins  45

Power Pins
gnd              16    I    Power            N/A      gnd                    gnd
vdd              10    I    Power            N/A      vdd                    vdd 1.5V, core voltage
vdd250           3     I    Power            N/A      vdd250                 vdd 2.5V, IO voltage
vdd330           5     I    Power            N/A      vdd330                 vdd 3.3V, IO voltage

Total Pins       81

35 Memjet Printhead

This section is quoted verbatim from SoPEC/MoPEC Bilithic Printhead Reference document [10].

35.1 Background 

Silverbrook's bilithic Memjet™ printheads are the target printheads for printing systems which will be
controlled by SoPEC and MoPEC devices.

This document presents the format and structure of these printheads, and describes their possible
arrangements in the target systems. It also defines a set of terms used to differentiate between the types of
printheads and the systems which use them.

35.2 Companion Documents 

Currently, this document is only concerned with the structure of the printheads and their systems, with 
regard to the way in which dot data is loaded. 

Refer to the Bilithic Printhead Specification [2] for the complete description of the functionality of these 
devices. 

This document relies on certain definitions and details presented in Bilithic Printhead Specification [2]. 

35.3 Definitions 

This document presents terminology and definitions used to describe the bilithic printhead systems. These
terms and definitions are as follows:

• Printhead Type - There are 3 parameters which define the type of printhead used in a system:

• Direction of the data flow through the printhead (clockwise or anti-clockwise, with the printhead

shooting ink down onto the page).

• Location of the left-most dot (upper row or lower row, with respect to V^ ), 

• Printhead footprint (type A or type B, characterized by the data pin being on the left or the right of 

where is at the top of the printhead). 

• Printhead Arrangement - Even though there are 8 printhead types, each arrangement has to use a spe- 

cific pairing of printheads, as discussed in Section 35.4. This gives 4 pairs of printheads. However, 
because the paper can flow in either direction with respect to the printheads, there are a total of eight 
possible arrangements, e.g. Arrangement 1 has a Type 0 printhead on the lef^ with respect to the 
paper flow, and a Type 1 printhead on the right Arrangement 2 uses the same printhead pair as 
Arrangement 1, but the paper flows in the opposite direction. 

• Color Q is always the first color plane encotmtered by the paper. 

• Dot 0 is defined as the nozzle which can print a dot in the left-most side of the page. 

• The Even Plane of a color corresponds to the row of nozzles that prints dot 0. 

Note that throughout this document, where the various printheads and systems are presented, the print- 
heads always shoot ink down onto the page. 

Figure 253 shows the 8 different possible printhead types. Type 0 is identical to the Right Printhead pre- 
sented in Figure 3 in [2], and Type 1 is the same as the Left Printhead as defined in [2]. 
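
Each of the three parameters in the Printhead Type definition takes one of two values, which is where the count of 8 printhead types comes from. The sketch below simply enumerates the combinations. The class and field names are illustrative assumptions, and the sketch deliberately does not assign Type numbers 0 to 7 to the combinations, because that mapping is given only by Figure 253.

```python
from dataclasses import dataclass
from itertools import product

# Hypothetical identifiers: the source defines the parameters, not these names.
DATA_FLOW = ("clockwise", "anti-clockwise")
LEFTMOST_DOT_ROW = ("upper", "lower")
FOOTPRINT = ("A", "B")


@dataclass(frozen=True)
class PrintheadType:
    data_flow: str
    leftmost_dot_row: str
    footprint: str


def all_printhead_types():
    """Enumerate every combination of the three two-valued parameters."""
    return [PrintheadType(*combo)
            for combo in product(DATA_FLOW, LEFTMOST_DOT_ROW, FOOTPRINT)]


if __name__ == "__main__":
    for printhead_type in all_printhead_types():
        print(printhead_type)
```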



Doc: SoPEC_hardware_deslgn S3 Proprietary Document ^gQ Nov 2002 

Version: 2.3 Page 538 



SoPEC : Hardware Design 



While the printheads shown in Figure 253 look to be of equal width (having the same number of nozzles), it 
is important to remember that in a typical system, a pair of unequal sized printheads may be used. 



[Figure 253: diagrams of the Type 0 to Type 7 printheads, each showing the nozzle rows for Color n] 

Figure 253. Printhead Types 0 to 7 

Table 167 defines the printhead pairing and the location of each printhead type, with respect to the flow of 
paper, for the 8 possible arrangements. 



Table 167. Definition of the different printhead arrangements 

Printhead        Printhead on left with       Printhead on right with 
Arrangement      respect to paper flow        respect to paper flow 

Arrangement 1    Type 0                       Type 1 
Arrangement 2    Type 1                       Type 0 
Arrangement 3    Type 2                       Type 3 
Arrangement 4    Type 3                       Type 2 
Arrangement 5    Type 4                       Type 5 
Arrangement 6    Type 5                       Type 4 
Arrangement 7    Type 6                       Type 7 
Arrangement 8    Type 7                       Type 6 
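
Table 167 can be captured as a small lookup structure. The following is a minimal sketch; the names `ARRANGEMENTS`, `printhead_pair` and `opposite_flow` are hypothetical. It assumes that, as stated for Arrangements 1 and 2, each even-numbered arrangement is the preceding odd-numbered one with the paper flowing in the opposite direction.

```python
# Table 167 as data: arrangement -> (type on left, type on right),
# where left/right are with respect to the direction of paper flow.
ARRANGEMENTS = {
    1: (0, 1), 2: (1, 0),
    3: (2, 3), 4: (3, 2),
    5: (4, 5), 6: (5, 4),
    7: (6, 7), 8: (7, 6),
}


def printhead_pair(arrangement):
    """Return (left_type, right_type) for an arrangement number 1..8."""
    if arrangement not in ARRANGEMENTS:
        raise ValueError(f"unknown arrangement: {arrangement}")
    return ARRANGEMENTS[arrangement]


def opposite_flow(arrangement):
    """Arrangement that uses the same pair with the paper flow reversed.

    Assumption: arrangements are grouped as (1,2), (3,4), (5,6), (7,8).
    """
    if arrangement not in ARRANGEMENTS:
        raise ValueError(f"unknown arrangement: {arrangement}")
    return arrangement + 1 if arrangement % 2 else arrangement - 1
```

Reversing the paper flow swaps which printhead counts as being on the left, so `printhead_pair(opposite_flow(a))` is `printhead_pair(a)` with its two entries exchanged.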
35.4 Bilithic Printhead Systems 

When using the bilithic printheads, the position of the power/gnd bars, coupled with the physical footprint 
of the printheads, means that we must use a specific pairing of printheads together for printing on the same 
side of an A4 (or wider) page, e.g. we must always use a Type 0 printhead with a Type 1 printhead, etc. 

While a given printing system can use any one of the eight possible arrangements of printheads, this 
document only presents two of them, Arrangement 1 and Arrangement 2, for purposes of illustration. These 
two arrangements are discussed in subsequent sections of this document. However, the other 6 possibilities 
also need to be considered. 

The main difference between the two printhead arrangements discussed in this document is the direction 
of the paper flow. Because of this, the dot data has to be loaded differently in Arrangement 1 compared to 
Arrangement 2, in order to render the page correctly. 



35.4.1 Example 1: Printhead Arrangement 1 

Figure 254 shows an Arrangement 1 printing setup, where the bilithic printheads are arranged as follows: 

• The Type 0 printhead is on the left with respect to the direction of the paper flow. 

• The Type 1 printhead is on the right. 

[Figure 254 labels: Type 0 Printhead, Type 1 Printhead, Gnd, Direction of Paper Flow. The printheads are 
facing downwards. The ink is being shot down onto the page.] 

Figure 254. Identification of printhead nozzles and shift-register sequences for printheads in 
Arrangement 1 

Table 168 lists the order in which the dot data needs to be loaded into the above printhead system to 
ensure color 0-dot 0 appears on the left side of the printed page. 



Table 168. Order in which the even and odd dots are loaded for printhead Arrangement 1 

          Type 0 Printhead                      Type 1 Printhead 

Odd       Loaded second in descending order.    Loaded first in descending order. 
Even      Loaded first in ascending order.      Loaded second in ascending order. 
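
The load order in Table 168 can be illustrated with a short sketch. It is a simplification under stated assumptions: it handles a single color plane of a single printhead, numbers the dots locally from 0 to `num_dots - 1`, and uses the hypothetical name `load_order`. The real dot numbering spans both printheads of the pair.

```python
def load_order(num_dots, even_first):
    """Return dot indices in load order for one color plane.

    Per Table 168, even dots are loaded in ascending order and odd dots
    in descending order; even_first selects which plane is loaded first.
    """
    even = list(range(0, num_dots, 2))                   # 0, 2, 4, ...
    odd = sorted(range(1, num_dots, 2), reverse=True)    # ..., 5, 3, 1
    return even + odd if even_first else odd + even


if __name__ == "__main__":
    # First column of Table 168: even plane loaded first.
    print(load_order(8, even_first=True))    # [0, 2, 4, 6, 7, 5, 3, 1]
    # Second column of Table 168: odd plane loaded first.
    print(load_order(8, even_first=False))   # [7, 5, 3, 1, 0, 2, 4, 6]
```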



Figure 255 shows how the dot data is demultiplexed within the printheads. 



[Figure 255 labels: Data[0], Data[1], Demux, Type 0 Printhead, Type 1 Printhead] 

Figure 255. Demultiplexing of data within the printheads in Arrangement 1 

Figure 256 and Figure 257 show the way in which the dot data needs to be loaded into the printheads in 
Arrangement 1, to ensure that color 0-dot 0 appears on the left side of the printed page. 



[Figure 256: signalling diagram including SrClk] 

Figure 256. Signalling for a Type 0 printhead in Arrangement 1 

[Figure 257: signalling diagram including SrClk] 

Figure 257. Signalling for a Type 1 printhead in Arrangement 1 



35.4.2 Example 2: Printhead Arrangement 2 

Figure 258 shows an Arrangement 2 printing setup, where the bilithic printheads are arranged as follows: 

• The Type 1 printhead is on the left with respect to the direction of the paper flow. 

• The Type 0 printhead is on the right. 

[Figure 258 notes: The printheads are facing downwards. The ink is being shot down onto the page.] 

Figure 258. Identification of printhead nozzles and shift-register sequences for printheads in 
Arrangement 2 

Table 169 lists the order in which the dot data needs to be loaded into the above printhead system, to 
ensure color 0-dot 0 appears on the left side of the printed page. 

Table 169. Order in which the even and odd dots are loaded for printhead Arrangement 2 

          Type 0 Printhead                      Type 1 Printhead 

Odd       Loaded first in descending order.     Loaded second in descending order. 
Even      Loaded second in ascending order.     Loaded first in ascending order. 

Figure 259 shows how the dot data is demultiplexed within the printheads. 

Figure 259. Demultiplexing of data within the printheads in Arrangement 2 

Figure 260 and Figure 261 show the way in which the dot data needs to be loaded into the printheads in 
Arrangement 2, to ensure that color 0-dot 0 appears on the left side of the printed page. 

[Figure 260: signalling diagram showing Data[0], Data[1] and SrClk] 

Figure 260. Signalling for a Type 0 printhead in Arrangement 2 

[Figure 261: signalling diagram showing Data[0], Data[1] and SrClk] 

Figure 261. Signalling for a Type 1 printhead in Arrangement 2 

35.4.3 Conclusions 

Comparing the signalling diagrams for Arrangement 1 with those shown for Arrangement 2, it can be seen 
that the color/dot sequence output for a printhead type in Arrangement 1 is the reverse of the sequence for 
the same printhead in Arrangement 2, in terms of the order in which the color plane data is output as well as 
whether even or odd data is output first. However, the order within a color plane remains the same, i.e. odd 
descending, even ascending. 

From Figure 262 and Table 170, it can be seen that the plane which has to be loaded first (i.e. even or odd) 
depends on the arrangement. Also, the order in which the dots have to be loaded (e.g. even ascending or 
descending etc.) is dependent on the arrangement. 

If the device controlling the printheads can re-order the bits according to the following criteria, then it 
should be able to operate in all the possible printhead arrangements: 

• Be able to output the even or odd plane first. 

• Be able to output even and odd planes in either ascending or descending order, independently. 

• Be able to reverse the sequence in which the color planes of a single dot are output to the printhead. 
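
The three re-ordering criteria above can be sketched as independent switches on a sequence generator. This is an illustration only, not the SoPEC implementation: the function name, its parameters, the local dot numbering, and the choice to emit every color of a dot before moving to the next dot are all assumptions made for the example.

```python
def output_sequence(num_dots, num_colors, odd_first=False,
                    even_descending=False, odd_descending=True,
                    reverse_colors=False):
    """Return (color, dot) pairs in the order they would be output."""
    even = list(range(0, num_dots, 2))
    odd = list(range(1, num_dots, 2))
    # Criterion 2: each plane's direction is chosen independently.
    if even_descending:
        even.reverse()
    if odd_descending:
        odd.reverse()
    # Criterion 1: either plane may be output first.
    planes = [odd, even] if odd_first else [even, odd]
    # Criterion 3: the color order of a single dot may be reversed.
    colors = list(range(num_colors))
    if reverse_colors:
        colors.reverse()
    return [(c, d) for plane in planes for d in plane for c in colors]


if __name__ == "__main__":
    print(output_sequence(4, 2))
    print(output_sequence(4, 2, odd_first=True, reverse_colors=True))
```

With the defaults (even ascending first, odd descending second), four dots and two colors give dots 0, 2, 3, 1 with colors 0 then 1 for each dot. Setting `odd_first=True` and `reverse_colors=True` gives dots 3, 1, 0, 2 with colors 1 then 0 for each dot.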

Figure 262. All 8 Printhead Arrangements 



Table 170. Order in which even and odd dots and planes are loaded into the various printhead 
arrangements 









Arrangement 1 


Even ascending loaded first 
Odd descending loaded second 


Odd descending loaded first 
Even ascending loaded second 


Arrangement 2 


Odd descending loaded first 
Even ascending loaded second 


Even ascending loaded first 
Odd descending loaded second 









Arrangement 3 


Odd ascending loaded first 
Even descending loaded second 


Even descending loaded first 
Odd ascending loaded second 


Arrangement 4 


Even descending loaded first 
Odd ascending loaded second 


Odd ascending loaded first 
Even descending loaded second 


Arrangement 5 


Odd ascending loaded first 
Even descending loaded second 


Even descending loaded first 
Odd ascending loaded second 


Arrangement 6 


Even descending loaded first 
Odd ascending loaded second 


Odd ascending loaded first 
Even descending loaded second 


Arrangement 7 


Even ascending loaded first 
Odd descending loaded second 


Odd descending loaded first 
Even ascending loaded second 


Arrangement 8 


Odd descending loaded first 
Even ascending loaded second 


Even ascending loaded first 
Odd descending loaded second 
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